Advanced SCD Type 2 Implementation in SSIS
In SSIS, the Slowly Changing Dimension (SCD) component is a useful tool for transforming dimension tables with SCD Type 1 and Type 2 columns. However, its default functionalities are limited. For instance, it requires choosing between an 'active' column or start date and end date to identify current and expired records. Additionally, it assumes that a NULL value in the end date indicates an active record.
This article presents steps to modify the generated
components by the SSIS Slowly Changing Dimension Wizard to achieve more complex
SCD Type 2 implementations. In this example, we aim to accomplish the following
objectives:
1.
Use both an active_flag and start_date/end_date
to indicate active records.
2.
Utilize a magical date (9999-12-31) for the
end_date of new records – this is easier for querying.
3.
Employ the current date (System::StartTime) as
the start_date for new records and as the end_date for SCD Type 2 columns when
they change.
Solution
The Slowly Changing Dimension component offers three
outputs: Historical Attribute Inserts Output, New Output, and Changing
Attribute Updates Output. We will modify the following components under the
Attribute Inserts Output and New Output to achieve our goals.
·
Under “Historical
Attribute Inserts Output”
o
Derived Column
§
Add start_date = @[System::StartTime]
o
OLE DB Command
§
Modify SqlCommand in Component Properties as
follows
UPDATE [dbo].[dimension_table] SET [active_flag]
= ?, end_date = ? WHERE [business_key] = ? AND [active_flag] = '1'
§
In Column Mappings, add start_date to Param_1
and change business_key to Param_2
·
Under “New
Output”
o
Add a “Derived Column 2” with the following
column:
§ start_date = @[System::StartTime]
·
In “Union
All”
o
Add start_date that includes start_date from
both outputs
·
In “Derived
Column 1”
o Add end_date = “9999-12-31”
·
In “Insert
Destination
o In Mappings, map start_date and end_date
Conclusion
Comments
Post a Comment