Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
umerikhlas
Contributor III
Contributor III

Having trouble with Slowly Changing Dimension with Interval Match()

Dear All community members,

I am trying to implement SCD here but unable to get the validated figures. Please have a look on the attached data model and suggest the way out.

Colin_Albert‌ kindly have a look please!

Kind Regards,

UI

3 Replies
hic
Former Employee
Former Employee

First of all, you need to establish the link between a sales transaction and a department while respecting the date intervals. As it is now, you don't use the date information.

Further, if each sales transaction "belongs to" a department, and each department has several employees, I don't see how you can establish a link between a specific sales transaction and a specific employee. You need to have employee information in the sales table for this.

See also Slowly Changing Dimensions

HIC

umerikhlas
Contributor III
Contributor III
Author

Yes hic‌. You are right and that is where I am having trouble.

  • Employee does not have the direct relationship with Sales table.
  • Department history records the information for employees with time intervals.
  • Sales order is associated with department table, and it has a sales date.
  • I am confused that how can we get to know when the sales was occured, at that time which employee was associated with the particular department?
  • I followed the steps and tried to implement interval match with join, but since sales table is not directly associated to employee therefore I found some difficulties.

Kindly share your ideas in the light of your experience.

Thanks

UI

hic
Former Employee
Former Employee

This is a classical example of a fan trap, i.e. a model where you link over a table with lower cardinality, so that you cannot resolve the relationships completely. In your case, each sales transaction will be linked to several employees, since the link goes over the DepartmentHistory table.

Use the following script and you will get the following data model.

HIC

Image1.png

Departments:
LOAD DepartmentID,
DepartmentName
FROM Departments.txt (...);

Sales:
LOAD DepartmentID,
SalesDate & '|' & DepartmentID   as DepartmentDateKey,
SalesID,
SalesDate,
SalesAmount
FROM Sales.txt (...);

DepartmentHistory:
LOAD DepartmentHistoryID,
FromDate & '|' & ToDate & '|' & DepartmentID as DepartmentIntervalKey,
EmployeeID,
DepartmentID,
FromDate,
ToDate
FROM DepartmentHistory.txt (...);

Employee:
LOAD EmployeeID,
EmployeeName
FROM Employee.txt (...);

tmpIntervalMatch:
IntervalMatch (SalesDate, DepartmentID)
Load FromDate, ToDate, DepartmentID   
Resident DepartmentHistory;

IntervalMatch:
Load
SalesDate & '|' & DepartmentID   as DepartmentDateKey,
FromDate & '|' & ToDate & '|' & DepartmentID as DepartmentIntervalKey
Resident tmpIntervalMatch;

Drop Table tmpIntervalMatch;
Drop Field DepartmentID From Sales;