Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
In an Excel-file validation dates are included to show which record should be used.
For example:
BusinessUnit ; Cluster ; Account ; Validation date from ; Validation date to
BU 1 ; Owner A ; Company F ; 01-01-2012 ; 01-01-2020
BU 2 ; Owner B ; Company G ; 01-01-2012 ; 01-01-2020
BU 2 ; Owner A ; Company F ; 09-09-2013 ; 01-01-2020
If you look at above example you can see that Account F has very recently been moved to another cluster.
The account along with forecastdata, actual (hour and volume) is gathered from textfiles.
AccountData:
LOAD Account,
[Actual Hours],
[Actual Volumes],
[Forecasted Hours],
[Forecasted Volume]
FROM $(TextFiles);
It the same way it's possible to load the BussinessUnit and Cluster into a seperate table:
BU:
LOAD BusinessUnit,
Cluster,
Account,
[Validation date from],
[Validation date to]
FROM '$(vDataFolder)BuCluster.xlsx'
Now I need help cleaning up this BU - table.
The account can only have one result in this table:
The one with the most recent [Validation date from] and with a valid [Validation date to]
Help would be very much appreciated.
Hi
I agree with tresesco, this is a slowly changing dimension. The appropriate way to handle this in QV, is to use an "extended interval match". For this to work, your account data needs a date field, to identify which BU and cluster to use for an account (as this is what changes over time).
Lets assume that AccountData has a date field, eg: Period (stored as a date)
Left Join (AccountData)
IntervalMatch (Period, Account)
LOAD [Validation date from],
[Validation date to],
Account
Resident BU;
This will create a syn table, which is OK, but if you would like to eliminate it, then:
Left Join (AccountData)
LOAD [Validation date from],
[Validation date to],
Account,
BusinessUnit,
Cluster
Resident BU;
DROP Fields [Validation date from], [Validation date to];
DROP Table BU;
Now the BusinessUnit and Cluster are correct for each [Period] in the AccountData table.
Hope that I understood your requirement and that this helps
Jonathan
I am not very sure if i have understood. May be link about Slowly changing dimension could help you.
You may try the below logic:
BU:
LOAD BusinessUnit,
Cluster,
Account,
[Validation date from],
[Validation date to]
FROM '$(vDataFolder)BuCluster.xlsx'
Right Join
LOAD BusinessUnit,
Cluster,
Account,
Max([Validation date from]) as [Validation date from]
FROM '$(vDataFolder)BuCluster.xlsx'
Where [Validation date from] <= [Validation date to]
Group By
BusinessUnit,
Cluster,
Account;
Regards,
som
Hi
I agree with tresesco, this is a slowly changing dimension. The appropriate way to handle this in QV, is to use an "extended interval match". For this to work, your account data needs a date field, to identify which BU and cluster to use for an account (as this is what changes over time).
Lets assume that AccountData has a date field, eg: Period (stored as a date)
Left Join (AccountData)
IntervalMatch (Period, Account)
LOAD [Validation date from],
[Validation date to],
Account
Resident BU;
This will create a syn table, which is OK, but if you would like to eliminate it, then:
Left Join (AccountData)
LOAD [Validation date from],
[Validation date to],
Account,
BusinessUnit,
Cluster
Resident BU;
DROP Fields [Validation date from], [Validation date to];
DROP Table BU;
Now the BusinessUnit and Cluster are correct for each [Period] in the AccountData table.
Hope that I understood your requirement and that this helps
Jonathan