Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Community!
I have the following two tables
Main Table
Key - Unique | Primary Date |
AAA | 01/24/2019 |
BBB | 04/19/2019 |
Second Table
Key - Non Unique | Secondary Date |
AAA | 12/23/2018 |
AAA | 01/20/2019 |
BBB | 01/30/2019 |
AAA | 03/20/2019 |
BBB | 05/20/2019 |
In my load script I am trying to add another column onto Main Table that is informed by the Secondary Table in the following way :-
Displays '1' if the Key in the Secondary Table has a date within a certain range (e.g. in the same month) of the primary date of the Key in the Primary Table. Displays '0' otherwise.
In the above example, AAA would have a '1' as Secondary Date 01/20/2019 falls one week around Primary Date 01/24/2019. BBB would have a '0'.
How would this be possible?
One week in either direction or only one week before the Secondary Date?
Try this
MainTable:
LOAD * INLINE [
Key, Primary Date
AAA, 01/24/2019
BBB, 04/19/2019
];
tmpSecondTable:
LOAD * INLINE [
Key, Secondary Date
AAA, 12/23/2018
AAA, 01/20/2019
BBB, 01/30/2019
AAA, 03/20/2019
BBB, 05/20/2019
];
SecondTable:
LOAD Key,
Date([Secondary Date] - 7) as StartDate,
Date([Secondary Date] + 7) as EndDate,
If(Key = Peek('Key'), RangeSum(Peek('SNo'), 1), 1) as SNo
Resident tmpSecondTable
Order By Key, [Secondary Date];
DROP Table tmpSecondTable;
Left Join (MainTable)
IntervalMatch([Primary Date], Key)
LOAD StartDate,
EndDate,
Key
Resident SecondTable;
Left Join (MainTable)
LOAD *
Resident SecondTable;
DROP Table SecondTable;
@sunny_talwar , I believe removing the synthetic keys might with the join might lead to potential issues of duplicated values. If there is more than one interval matching the Primary Date, the join will duplicate any value that might be associated to the keys (https://community.qlik.com/t5/Qlik-Design-Blog/IntervalMatch/ba-p/1464547)
I think a safer approach is to leave the synthetic keys with something like this; I've commented part of your code and added another matching interval to illustrate the potential problem:
MainTable:
LOAD * INLINE [
Key, Primary Date, Value
AAA, 24/01/2019, 5
BBB, 19/04/2019, 8
];
tmpSecondTable:
LOAD * INLINE [
Key, Secondary Date
AAA, 23/12/2018
AAA, 20/01/2019
AAA, 22/01/2019
BBB, 30/01/2019
AAA, 20/03/2019
BBB, 20/05/2019
];
SecondTable:
LOAD Key,
Date([Secondary Date] - 7) as StartDate,
Date([Secondary Date] + 7) as EndDate,
If(Key = Peek('Key'), RangeSum(Peek('SNo'), 1), 1) as SNo
Resident tmpSecondTable
Order By Key, [Secondary Date];
DROP Table tmpSecondTable;
MatchTable:
//Left Join (MainTable)
IntervalMatch([Primary Date], Key)
LOAD StartDate,
EndDate,
Key
Resident SecondTable;
//Left Join (MainTable)
//LOAD *
//Resident SecondTable;
//
//DROP Table SecondTable;
Running your script in QlikView... gave me this
This doesn't look right...
And I think as long as Main Table has a Unique Key... I don't see any issue with joining....
Sample qvw attached
Running your script (amended to match two periods, which is a realistic scenario) an extra record is inserted in the original table; you can see this in the screenshot below where I've sum by the value field. As you can see the total sum for AAA is 10, while in the original table was 5 (i.e. there is an extra row created by the join)
If you leave the synthetic key then you don't have this issue since you are not joining thus not creating the extra record; the value for AAA is still 5 as it should be
---------------------------------------------------------------
For clarity, below your original script with the added Value field and extra date in the second table
MainTable:
LOAD * INLINE [
Key, Primary Date, Value
AAA, 24/01/2019, 5
BBB, 19/04/2019, 8
];
tmpSecondTable:
LOAD * INLINE [
Key, Secondary Date
AAA, 23/12/2018
AAA, 20/01/2019
AAA, 22/01/2019
BBB, 30/01/2019
AAA, 20/03/2019
BBB, 20/05/2019
];
SecondTable:
LOAD Key,
Date([Secondary Date] - 7) as StartDate,
Date([Secondary Date] + 7) as EndDate,
If(Key = Peek('Key'), RangeSum(Peek('SNo'), 1), 1) as SNo
Resident tmpSecondTable
Order By Key, [Secondary Date];
DROP Table tmpSecondTable;
Left Join (MainTable)
IntervalMatch([Primary Date], Key)
LOAD StartDate,
EndDate,
Key
Resident SecondTable;
Left Join (MainTable)
LOAD *
Resident SecondTable;
DROP Table SecondTable;
--------------------------------------
Below the amended script without the join
MainTable:
LOAD * INLINE [
Key, Primary Date, Value
AAA, 24/01/2019, 5
BBB, 19/04/2019, 8
];
tmpSecondTable:
LOAD * INLINE [
Key, Secondary Date
AAA, 23/12/2018
AAA, 20/01/2019
AAA, 22/01/2019
BBB, 30/01/2019
AAA, 20/03/2019
BBB, 20/05/2019
];
SecondTable:
LOAD Key,
Date([Secondary Date] - 7) as StartDate,
Date([Secondary Date] + 7) as EndDate,
If(Key = Peek('Key'), RangeSum(Peek('SNo'), 1), 1) as SNo
Resident tmpSecondTable
Order By Key, [Secondary Date];
DROP Table tmpSecondTable;
MatchTable:
IntervalMatch([Primary Date], Key)
LOAD StartDate,
EndDate,
Key
Resident SecondTable;
I can't open your qvw at the moment
I see what you mean. I guess there are way around it, but to keep it simple, synthetic key might be a better way.