Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Just curious if there are others running into the same situation.
QlikTech has given us Concatenate, Inner, Outer, Left, and Right, but I keep running into a situation where I need, shall we say a Merge join? A merge join would combine columns on the same row rather than appending them AND instead of using nulls in numeric columns, it would use zeros so that your SUM functions still work and so that you can to intra-row calculations on the resulting table.
Table 1:
Item | Sales |
---|---|
Cars | $100 |
Trucks | $150 |
Vans | $145 |
Table 2:
Item | Costs |
---|---|
Cars | $80 |
Trucks | $100 |
Bicycles | $25 |
A merge join, would then result in the following:
Resulting Table :
Item | Sales | Costs |
---|---|---|
Cars | $100 | $80 |
Trucks | $150 | $100 |
Vans | $145 | 0 |
Bicycles | 0 | $25 |
Now, if I want to calculate Margin in the load script, I can just perform a Sales - Cost without having to worry that the records are not aligned or without having to write all kinds of isnull tests in the expressions just to get it to work right.
I know I can load table 1 and table 2 into a temporary table and then perform a group by on the temporary table to generate the final results table you see here, but the problem is that the Group By requires another pass over the data and it also eats up too much memory (see http://community.qlik.com/message/129484#129484).
This would sure help with the kinds of queries I keep running into. Anyone else need a Merge join (or have I just totally missed the boat and there is a way to do this)?
The process of doing your 'Merge Join' is as follows. If you need to do any 'Group By's, you may do them either at the time of qvd generation or loading data from qvd. I think memory may not be a constraint for this process. Will this work for you?
// Load Sales data from your source database or qvd file
Sales:
LOAD * Inline [
Item, Sales
Car,100
Truck,150
Van,145
];
// Load Costs data from your source database or qvd file
Costs:
LOAD * Inline [
Item, Costs
Car,80
Truck,100
Bicycle,25
];
//Get a list of Item field values from both tables
Result1:
LOAD Distinct Item as Item1 Resident Sales;
Concatenate (Result1) LOAD Distinct Item as Item1 Resident Costs;
//Add Sales data and drop Sales table
Left Join (Result1) LOAD Item as Item1, Sales Resident Sales;
Drop Tables Sales;
//Add Costs data and drop Costs table
Left Join (Result1) LOAD Item as Item1, Costs Resident Costs;
DROP Table Costs;
//Replace null with zero if necessary - this may not be necessary!
Result:
NoConcatenate
LOAD Item1 as Item, If(Len(Sales)>0,Sales,0) as Sales, If(len(Costs)>0,Costs,0) as Costs Resident Result1;
DROP Table Result1;
Hi,
As blackhawk mentioned Qliktech should come up with some special join to meet this requirement.
Also as krishnamoorthy mentioned you need not have any Group by for achieving the above requirement, below is the simpler version of krishnamoorthy's suggested solution.
// Load Sales data from your source database or qvd file
Sales:
LOAD * Inline [
Item, Sales
Car,100
Truck,150
Van,145
];
// Load Costs data from your source database or qvd file
Outer Join (Sales)
LOAD * Inline [
Item, Costs
Car,80
Truck,100
Bicycle,25
];
Final:
Noconcatenate Load Item, If(Len(Sales)>0,Sales,0) as Sales, If(len(Costs)>0,Costs,0) as Costs Resident Sales;
Drop Table Sales;
Thanks for your assistance krishnamoorthy and BlackRockS. I appreciate your input greatly.
For standard table merges, I pretty much use what BlackRockS had suggested. What I failed to mention in my original posting was that it should also perform an additive operation (or perhaps with an "Accumulate" option) on each row with common key fields so that you can go from detail records to some aggregate level without having to sum everything.
Thus, what I should have stated was that:
Table 1:
Item | Month | Sales |
---|---|---|
Car | Jan | 100 |
Truck | Jan | 150 |
Van | Jan | 45 |
Van | Feb | 100 |
Table 2:
Item | Cost |
---|---|
Car | 80 |
Truck | 100 |
Bicycle | 25 |
with this new Merge join should now result in the following:
Item | Sales | Cost |
---|---|---|
Car | 100 | 80 |
Truck | 150 | 100 |
Van | 145 | 0 |
Bicycle | 0 | 25 |
And the script would look something like
// Load Sales data from your source database or qvd file
Sales:
LOAD Item, Sales
Inline [
Item, Month, Sales
Car, Jan, 100
Truck, Jan, 150
Van, Jan, 100
Van, Feb, 45
];
// Load Costs data from your source database or qvd file
Merge Join (Sales)
LOAD *
Inline [
Item, Costs
Car,80
Truck,100
Bicycle,25
];
Now this would be a VERY welcome feature!!