Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I have the following two tables I need to link together (sample data is provided in attatched xls-file):
Every row in the Transaction table where Source = FPER corresponds to two or more rows in the Sales Details table, ie:
The problem is finding/defining a unique key to match the two tables together.
The fields Region, Store, Item & Source are all ok, but the Date field sometimes differs a few days between the tables.
Therefore I would like to add a SubTotal field to the Sales Details table, making it look like this:
Having a SubTotal would allow to construct a unique composite key, giving the desired result:
So the question is for a clever way to add this SubTotal field to the load script..
Many thanks!
In that case, use the code above for Transactions table key and for the second table ..
[Sales_Details]:
Load * Inline [
Sales_Date, Sales_Region, Sales_Store, Sales_Item, Sales_Customer, Sales_Amount, Sales_Source
2019-05-27, 2321, 56810, 5021, Donald Duck, 3641, FPER
2019-05-27, 2321, 56810, 5021, Mickey Mouse, 1324, FPER
2019-05-27, 2321, 56810, 5521, Mickey Mouse, 222, FPER
2019-05-27, 2321, 56810, 5531, Donald Duck, 5098, FPER
2019-05-27, 2321, 56810, 5531, Mickey Mouse, 764, FPER
2019-04-26, 2321, 56810, 5021, Tony Soprano, 2979, FPER
2019-04-26, 2321, 56810, 5021, Tom Hanks, 2317, FPER
2019-04-26, 2321, 56810, 5521, Tony Soprano, 444, FPER
2019-04-26, 2321, 56810, 5521, Tom Hanks, 196, FPER
2019-04-26, 2321, 56810, 5531, Tony Soprano, 1905, FPER
2019-04-26, 2321, 56810, 5531, Tom Hanks, 764, FPER
2019-03-27, 3610, 47200, 5021, Tony Soprano, 42800, FPER
2019-03-27, 3610, 47200, 5021, Bruce Willis, 39550, FPER
2019-03-27, 3610, 47200, 5021, Donald Duck, 23460, FPER
2019-03-27, 3610, 47200, 5021, Anthony Eden, 22807, FPER
2019-03-27, 3610, 47200, 5021, Lill-Babs Svensson, 19500, FPER
2019-03-27, 3610, 47200, 5122, Anthony Eden, 1623, FPER
2019-03-27, 3610, 47200, 5131, Anthony Eden, -983, FPER
2019-03-27, 3610, 47200, 5121, Anthony Eden, -1623, FPER
];
Left Join(Sales_Details)
Load Sales_Date, Sales_Region, Sales_Store, Sales_Item, sum(Sales_Amount) as KeySales
Resident Sales_Details
Group By Sales_Date, Sales_Region, Sales_Store, Sales_Item;
NewSales_Details:
Load *,
Year(Sales_Date) & '|' & Month(Sales_Date) & '|' & Sales_Region & '|' & Sales_Store & '|' & Sales_Item & '|' & Sales_Source & '|' & KeySales as SalesKey
Resident [Sales_Details];
Drop table Sales_Details;
The key will not be perfect but unfortunatly that's how my data set looks like. The real data is not for sales but to tie salary details to financial records, ie. to provide drill down capability to a P&L statement.
The Details table holds only one record per day for every region/store/item/source entry, so that should be ok.
If the date fields in both tables would be in sync the key could simply have been date/region/store/item/source.
Problem is that the date field in the Details table may differ one or more days (but always the same year and month) from the corresponding record in the Transaction table. But the SubTotals of the Details table will always sum up to the corresponding records in the Transaction table, hence the desire to use SubTotal to construct a unique key.
In that case, use the code above for Transactions table key and for the second table ..
[Sales_Details]:
Load * Inline [
Sales_Date, Sales_Region, Sales_Store, Sales_Item, Sales_Customer, Sales_Amount, Sales_Source
2019-05-27, 2321, 56810, 5021, Donald Duck, 3641, FPER
2019-05-27, 2321, 56810, 5021, Mickey Mouse, 1324, FPER
2019-05-27, 2321, 56810, 5521, Mickey Mouse, 222, FPER
2019-05-27, 2321, 56810, 5531, Donald Duck, 5098, FPER
2019-05-27, 2321, 56810, 5531, Mickey Mouse, 764, FPER
2019-04-26, 2321, 56810, 5021, Tony Soprano, 2979, FPER
2019-04-26, 2321, 56810, 5021, Tom Hanks, 2317, FPER
2019-04-26, 2321, 56810, 5521, Tony Soprano, 444, FPER
2019-04-26, 2321, 56810, 5521, Tom Hanks, 196, FPER
2019-04-26, 2321, 56810, 5531, Tony Soprano, 1905, FPER
2019-04-26, 2321, 56810, 5531, Tom Hanks, 764, FPER
2019-03-27, 3610, 47200, 5021, Tony Soprano, 42800, FPER
2019-03-27, 3610, 47200, 5021, Bruce Willis, 39550, FPER
2019-03-27, 3610, 47200, 5021, Donald Duck, 23460, FPER
2019-03-27, 3610, 47200, 5021, Anthony Eden, 22807, FPER
2019-03-27, 3610, 47200, 5021, Lill-Babs Svensson, 19500, FPER
2019-03-27, 3610, 47200, 5122, Anthony Eden, 1623, FPER
2019-03-27, 3610, 47200, 5131, Anthony Eden, -983, FPER
2019-03-27, 3610, 47200, 5121, Anthony Eden, -1623, FPER
];
Left Join(Sales_Details)
Load Sales_Date, Sales_Region, Sales_Store, Sales_Item, sum(Sales_Amount) as KeySales
Resident Sales_Details
Group By Sales_Date, Sales_Region, Sales_Store, Sales_Item;
NewSales_Details:
Load *,
Year(Sales_Date) & '|' & Month(Sales_Date) & '|' & Sales_Region & '|' & Sales_Store & '|' & Sales_Item & '|' & Sales_Source & '|' & KeySales as SalesKey
Resident [Sales_Details];
Drop table Sales_Details;
Lovely, many thanks!