-
Tuning: Min and Max in same query
Posted on July 12th, 2008 No commentsPlease read this article.
Karen explains why having min and max in the same query impacts performance and how to tackle the same.
-
Migration of a DW solution from Oracle to DB2
Posted on April 26th, 2008 No commentsRecently we migrated an entire DW module from Oracle to DB2. And our ETL was done using Business Objects Data Integrator.
Below is a sequence of steps that we followed to migrate the entire solution (including data structures, ETL code and universe)
Assumption:
–>All estimates done for 4 DB2 databases, 3 federated systems, 62 tables, 33 indexes, 6 DI jobs, 110 data flows, 151 transformations, 16 SQL transforms, 34 SQL() scripts, 220 objects in 23 classes and 2 reports.
–>Add 20% time as buffer for unexpected issues.
–>Some tasks can be executed in parallel, while some have strong dependency on completion of previous tasks.Work breakdown of migration and rough estimation:
1. Initiation & planning : 40 Hrs
Establish objectives and goals
Requirements gathering
Establish scope
Plan resources
Identify development environment2. Assessment : 40 Hrs
Assess tool and technology needs
Assess and understand technical requirements
Establish technical guidelines
Finalize tools and development environmentBODI, BODS, Business Intelligence, DB2, Data Integration, Data Modelling, Data warehouse, Database, Dimensional modelling, EIM, ETL, Metadata, Oracle, SAP BODI, BODS, Business objects, Data Integration, Data Modelling, Data Quality, Database, DB2, Dimensional modelling, DWH, Effort estimation, EIM, ETL, Metadata, Oracle, SAP-BO -
NULL and NOT IN (Oracle)
Posted on March 26th, 2008 No commentsHere is a very interesting scenario.
I have an employee table called EMP.
And I have a DEPT table.
Now, I write a query to find out all the employees who are part of any department. (DEPT_ID is properly updated and found in DEPT table)
Now, I want to know if any employee is not part of any department.
What??? No results? I know that employee with EMP_ID 9 doesn’t belong to any department. Where did he go?
Is Oracle drunk? Is it giving me wrong results? Wait let me check with another query.
-
Truncate and delete
Posted on July 25th, 2007 No commentsTruncate:
1. Moves the high water mark down to beginning.
2. Doesnt need commit. It is a DDL command.
3. No condition needed. Removes all rows.
4. Extents are deallocated. (If you specify the reuse storage clause, then the extents are not deallocated)
5. No rollback.Delete:
1. Doesnt change the high water mark.
2. Needs commit. It’s a DML command.
3. Can specify a condition and remove specific rows.
4. Extents are not deallocated.
5. Changes can be rolled back. -
Why index doesnt access null values?
Posted on July 17th, 2007 No commentsOracle indexes do not contain entries that point to rows with null values on all the indexed columns.
Therefore any query that has a column IS NULL or IS NOT NULL will not use the index.
-
Index usage
Posted on July 5th, 2007 No commentsWhat do you think? Will an index be used in these cases?
1. Table returns 2,500,000 (2.5 million) records from a table having 10,000,000 rows (10 million).
2. You use a value !=’45′ condition in your where clause. ( Value column is indexed)
3. You say Value=’%ABC’ condition in where clause (value column is indexed)
4. You say Value=’ABC%’ condition in where clause (value column is indexed)
5. Where Value IS NULL (value column is indexed)
6. Where Value IS NOT NULL (value column is indexed)
In all these cases, do you think the index will be used or not?
-
Oracle analytic functions
Posted on June 28th, 2007 No commentsI recently purchased O’Reilly’s SQL cookbook. Its good. I came to know about the Oracle Window functions (analytic functions).
Some examples are
1. row_number() over()
2. dense_rank() over()
3. max() over()
4. Sum() over()
5. Lead() over()
6. Lag() over()See here for a detailed explanation.
Now, you have the trade_val column. But you want a running count of the same. How do you achieve it? See the query below. I have used the sum() over() function to get the sum. But hey! Its wrong. Its giving me the entire total and not the running total.
Yes, since I didnt specify anything in the over() section, it gave an entire total. So now I say order by stk_nam_key, date_key so that the running count is done as per the original order. See results below.
But hey! Again there is something different I need. I need the running count for each stk_nam_key and not for the entire table. See in the upper query, I have indicated with red arrow. When the stk_nam_key changed to 2, it gave me the running count. But I want to start fresh for every new grouping.So what do I do? I say over(partition by stk_nam_key order by stk_nam_key,date_key). Check the results now.
Dont get confused by the first value for the new grouping 3. The date_key for both 3′s are same and hence the sum is same for both.
-
Transposing rows to columns
Posted on June 20th, 2007 No commentsToday, I saw a question in an orkut SQL forum where the guy wanted to transpose rows to columns.
I replied to it.
Let me explain about it.
See this snapshot. I do a query of my test2 table. There are 2 rows. And the 4 weekly data has to be extracted into single column.
See, there are 4 columns that need to be turned into 1 column in 4 rows. So we need to do a cartesian join 4 times. So I use this table below.
See the cartesian join done and the results. See that each ord_no appears 4 times.
-
Pagination
Posted on June 16th, 2007 No commentsHave you tried any query like this?
select * from time_dim where rownum between 1 and 5
What will be the answer?Okay, what will be the answer for this one?
select * from time_dim where rownum between 2 and 5Surprised? Rownum will return any row only if it starts with 1. Everything else will not return anything.
Okay, I guess you knew this part already. Then try this query.
select * from time_dim where rownum between 1 and 5 order by time_key ascDo you think you will get the first 5 ranks ordered by time_key? Try it. You didnt. You got some 5 records ordered by time_key, not necessarily the top 5.
Why? Because rownum is a pseudo column that is assigned to rows that are fetched. The first 5 records that got fetched were returned with order by on time_key.
Then how do you make sure that you get the first 5 only?
For that you need something called pagination. Try this query
select * from
(
select rownum as rank, a.* from
(
select * from time_dim order by time_key
) a
) b
where rank=8Here you passed ordered the data from the query in a sub-query. Then passed it to the outer one with the rownum and gave it an alias. Then you used it in the outer query to get that rank.
It is necessary to first order the data and then pass it to an outer query to use the rownum. If you use rownum from the same query where you ordered, it will not be ordered correctly.
-
Fetch data that fall in a rank range
Posted on June 5th, 2007 No commentsThis query returns all the rows that fall in a certain rank range. Please note that the second condition is rnum and not rownum. If you use rownum, nothing will be returned.
select * from
( select a.*, rownum rnum from
(select * from expense_fct where expense is not null order by expense desc) a
where rownum <= &upperbound )
where rnum >= &lowerbound









