-
Is open source database a viable solution?
Posted on August 8th, 2008 No commentsPlease read the article at this link.
What do you think? Is open source database a viable solution?
The link says (as quoted by Forrester) that the market share for
- Open source database is at $850 million
- Commercial databases is $ 16 billion
-
Tuning: Min and Max in same query
Posted on July 12th, 2008 No commentsPlease read this article.
Karen explains why having min and max in the same query impacts performance and how to tackle the same.
-
Decfloat datatype not supported in BODI / BODS
Posted on July 12th, 2008 No commentsRecently I was working with a few DB2 UDB 9.5 tables and discovered that decfloat datatype columns were not recognized even though I had checked the option to recognize unsupported datatypes as varchar.
-
Migration of a DW solution from Oracle to DB2
Posted on April 26th, 2008 No commentsRecently we migrated an entire DW module from Oracle to DB2. And our ETL was done using Business Objects Data Integrator.
Below is a sequence of steps that we followed to migrate the entire solution (including data structures, ETL code and universe)
Assumption:
–>All estimates done for 4 DB2 databases, 3 federated systems, 62 tables, 33 indexes, 6 DI jobs, 110 data flows, 151 transformations, 16 SQL transforms, 34 SQL() scripts, 220 objects in 23 classes and 2 reports.
–>Add 20% time as buffer for unexpected issues.
–>Some tasks can be executed in parallel, while some have strong dependency on completion of previous tasks.Work breakdown of migration and rough estimation:
1. Initiation & planning : 40 Hrs
Establish objectives and goals
Requirements gathering
Establish scope
Plan resources
Identify development environment2. Assessment : 40 Hrs
Assess tool and technology needs
Assess and understand technical requirements
Establish technical guidelines
Finalize tools and development environmentBODI, BODS, Business Intelligence, DB2, Data Integration, Data Modelling, Data warehouse, Database, Dimensional modelling, EIM, ETL, Metadata, Oracle, SAP BODI, BODS, Business objects, Data Integration, Data Modelling, Data Quality, Database, DB2, Dimensional modelling, DWH, Effort estimation, EIM, ETL, Metadata, Oracle, SAP-BO -
NULL and NOT IN (Oracle)
Posted on March 26th, 2008 No commentsHere is a very interesting scenario.
I have an employee table called EMP.
And I have a DEPT table.
Now, I write a query to find out all the employees who are part of any department. (DEPT_ID is properly updated and found in DEPT table)
Now, I want to know if any employee is not part of any department.
What??? No results? I know that employee with EMP_ID 9 doesn’t belong to any department. Where did he go?
Is Oracle drunk? Is it giving me wrong results? Wait let me check with another query.
-
Truncate and delete
Posted on July 25th, 2007 No commentsTruncate:
1. Moves the high water mark down to beginning.
2. Doesnt need commit. It is a DDL command.
3. No condition needed. Removes all rows.
4. Extents are deallocated. (If you specify the reuse storage clause, then the extents are not deallocated)
5. No rollback.Delete:
1. Doesnt change the high water mark.
2. Needs commit. It’s a DML command.
3. Can specify a condition and remove specific rows.
4. Extents are not deallocated.
5. Changes can be rolled back. -
Why index doesnt access null values?
Posted on July 17th, 2007 No commentsOracle indexes do not contain entries that point to rows with null values on all the indexed columns.
Therefore any query that has a column IS NULL or IS NOT NULL will not use the index.
-
The maximum size of a fact table.
Posted on July 11th, 2007 No commentsWell, this is just a workout. It may or may not make much sense!
Here I have detailed how we can calculate the maximum size of a fact table.
Assume our datamart has 5 dimensions: Time_dim, customer_dim, product_dim, product_warehouse_dim, supplier_dim. And our fact table has 5 foreign keys, 15 measures, 2 degenerate columns
Time_dim has 20 years of rows on daily grain: so 365 X 20 = 7300 rows
There are 4000 customers in customer_dim.
200 products.
10 warehouses.
12 suppliers.So the maximum possible fact table rows is
7300 x 4000 x 200 x 10 x 12 = 7008000000000 (7008000 million or 7008 billion)Now this is the maximum no of rows possible in the fact table.(Assuming that every customer brought every product from every store supplied by every supplier on every day…..etc… it goes on … permutations and combinations)
Assuming that the 22 columns have an average size of 5 bytes, we need 110 bytes per row.
So now, 110 x 7008 billion = 71794 GB
The maximum possible size for this fact table (assuming that dimension count stays same) is 71794 GB.
The maximum possible increase per year could be 3590 GB.
-
Index usage
Posted on July 5th, 2007 No commentsWhat do you think? Will an index be used in these cases?
1. Table returns 2,500,000 (2.5 million) records from a table having 10,000,000 rows (10 million).
2. You use a value !=’45′ condition in your where clause. ( Value column is indexed)
3. You say Value=’%ABC’ condition in where clause (value column is indexed)
4. You say Value=’ABC%’ condition in where clause (value column is indexed)
5. Where Value IS NULL (value column is indexed)
6. Where Value IS NOT NULL (value column is indexed)
In all these cases, do you think the index will be used or not?
-
Oracle analytic functions
Posted on June 28th, 2007 No commentsI recently purchased O’Reilly’s SQL cookbook. Its good. I came to know about the Oracle Window functions (analytic functions).
Some examples are
1. row_number() over()
2. dense_rank() over()
3. max() over()
4. Sum() over()
5. Lead() over()
6. Lag() over()See here for a detailed explanation.
Now, you have the trade_val column. But you want a running count of the same. How do you achieve it? See the query below. I have used the sum() over() function to get the sum. But hey! Its wrong. Its giving me the entire total and not the running total.
Yes, since I didnt specify anything in the over() section, it gave an entire total. So now I say order by stk_nam_key, date_key so that the running count is done as per the original order. See results below.
But hey! Again there is something different I need. I need the running count for each stk_nam_key and not for the entire table. See in the upper query, I have indicated with red arrow. When the stk_nam_key changed to 2, it gave me the running count. But I want to start fresh for every new grouping.So what do I do? I say over(partition by stk_nam_key order by stk_nam_key,date_key). Check the results now.
Dont get confused by the first value for the new grouping 3. The date_key for both 3′s are same and hence the sum is same for both.





