Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I have a Qlik process with multiple steps (workspaces). The first is to import data from external sources, the second is to transform it, if necessary. I need to modify dates in the transform step in order to avoid date conflicts for a time-based index of standard values, but the issue I'm facing is that the dates to be compared are on different records. I cannot modify the process creating the data I'm importing.
My (simplified) import step pulls information where there are standard numerical values for a time period. Example:
Location, Start Date, End Date, Amount
A,01/01/2015,01/01/2015,3
A,01/01/2015,,4
B,08/14/2015,08/20/2015,5
B,08/20/2015,11/18/2015,6
B,11/18/2015,,7
C,06/01/2013,,2
The imported data has, for the same Location, the same Start Date as another record's End Date whenever there is a change in value. The current record will always have a null as the End Date. If I load the data as-is, then I'm going to have two different Amounts for a single Location on conflicting Start/End Dates (see records 3 vs 4 for 08/20/2015, and 4 vs 5 for 11/18/2015).
I cannot think of a simple way to modify my dates in the Load Script to account for these conflicts. Subtracting a day from the End Date field will render some single-day changes as if they never occurred. Adding a day to the Start Dates on records that do not have End Dates might cause issues due to records I'm joining with that have data on the day being pushed forward.
Ideally, I'd like to have the example data mirror the set below:
Location, Start Date, End Date, Amount
A,01/01/2015,01/01/2015,3
A,01/02/2015,,4
B,08/14/2015,08/20/2015,5
B,08/21/2015,11/18/2015,6
B,11/19/2015,,7
C,06/01/2013,,2
However, this is all dependent on information not contained in the record being modified, but would be dependent on if the record is an original record for a Location. This can't be determined until the rest of the dataset is examined. I've never done this in Qlik. Is there something which does this easily? If there is no simple solution, please just let me know that there is no simple solution rather than provide a complex answer. I'd rather not waste anyone's time if there isn't a simple solution.
Thank you in advance,
Greg
I think you can use the peek or previous function to get the values (location, end date) of the previous record read and change the value of the current record start date
I think you can use the peek or previous function to get the values (location, end date) of the previous record read and change the value of the current record start date
Maybe like this:
INPUT:
LOAD Location,
if(LocationRecordID = 1, [Start Date], Date(Rangesum(Peek([End Date]),1))) as [Start Date],
[End Date],
Amount;
LOAD *,
AutoNumber(recno(), Location) as LocationRecordID
INLINE [
Location, Start Date, End Date, Amount
A,01/01/2015,01/01/2015,3
A,01/01/2015,,4
B,08/14/2015,08/20/2015,5
B,08/20/2015,11/18/2015,6
B,11/18/2015,,7
C,06/01/2013,,2
];
Temp:
LOAD Location, Date#([Start Date],'MM/DD/YYYY') as [Start Date],Date#([End Date],'MM/DD/YYYY') as [End Date], Amount;
LOAD * Inline
[
Location, Start Date, End Date, Amount
A,01/01/2015,01/01/2015,3
A,01/01/2015,,4
B,08/14/2015,08/20/2015,5
B,08/20/2015,11/18/2015,6
B,11/18/2015,,7
C,06/01/2013,,2
];
Location:
NoConcatenate
LOAD Location,
If(Location = Previous(Location),Date([Start Date]+1,'MM/DD/YYYY'),[Start Date]) as [Start Date],
[End Date],
Amount
Resident Temp
Order By Location, [Start Date];
DROP Table Temp;
result:
Location | Start Date | End Date | Amount |
---|---|---|---|
A | 01/01/2015 | 01/01/2015 | 3 |
A | 01/02/2015 | 4 | |
B | 08/14/2015 | 08/20/2015 | 5 |
B | 08/21/2015 | 11/18/2015 | 6 |
B | 11/19/2015 | 7 | |
C | 06/01/2013 | 2 |
I would have to do an ORDER BY Location, Start Date, End Date prior to performing this, correct? (Does null End Date get sorted to last?)
I'd have to include a sort by End Date as well in case this is shuffled, correct? Do null End Dates get sorted to last?
Right, the input table should be sorted correctly.
Try something like this:
TempFormatDate: //This is a temporary table dropped at the end of the script. Creating this formats the two date fields
Load
Date(Start) as [Start Date],
Date(End) as [End Date];
LOAD * Inline [
Start, End
01/1/15, 01/1/15
];
Input:
Load * Inline [
Location, Start Date, End Date, Amount
A,01/01/2015,01/01/2015,3
A,01/01/2015,,4
B,08/14/2015,08/20/2015,5
B,08/20/2015,11/18/2015,6
B,11/18/2015,,7
C,06/01/2013,,2];
NoConcatenate
Output:
LOAD
Location,
if(Previous([End Date]) = [Start Date], [Start Date]+1,[Start Date]) as [Start Date],
[End Date],
Amount
Resident Input;
Drop Tables Input, TempFormatDate;
Do null End Dates get sorted to last?
If you are talking about a Resident Table LOAD in QV, I don't think so.
Could you replace NULL in [End Date] with e.g. today() or a future date?
edit:
In your case,
ORDER BY Location, [Start Date], [End Date] desc;
might be sufficient then without replacing NULL.