-
Logic behind implementing SCD 2
Posted on October 7th, 2008 No commentsLets talk about the logic behind SCD type 2 today.
We know that SCD 2 is about preserving all the changes in the dimension records. Let us see the logic behind how we can implement it.
NOTE: The steps below assume SCD 2 having a Begin_date and End_date and a Current_Flag column
- Check if the incoming row is already present in the target table (dimension) using the source primary key
- If it doesn’t exist in the target dimension
- Generate a surrogate key
- Enter source record’s date as the Begin_date
- Enter the default end date (which could be 31/12/2099) as the End_date
- If you have a Current_flag column, set it as ‘Y’ or ’1′ (or whatever you want)
- Insert into the dimension
- If the row exists in the target
- Check if the incoming and target current record are different (at least for one chosen attribute)
- If they are same, do nothing
- If they are different, do the following
- For the record in the target table, change (update) the End_date to source record’s date and set the Current_flag to ‘N’ or ’0′ or whatever
- Take the incoming record, generate a surrogate key, enter source record’s date as the Begin_date and the default date as the End_date. Also set the Current_flag to ‘Y’ or ’1′. Insert into the dimension
NOTE: The End_date of the previous record and the Begin_date of the current record are assumed to be the same. But some people prefer them to be different dates ie) End_date is 1 day lesser than the next records Current_date.
Data Integration, Data Modelling, Data warehouse, Dimensional modelling, EIM, ETL Data Integration, Data Modelling, Dimensional modelling, DWH, EIM, ETL, SCDLeave a reply
You must be logged in to post a comment.


