Salutations, Qlik Compadres.
I am looking to return how many hours ago a row is from the max timestamp per object.
For example:
+---------------------+--------+----------+
| Timestamp           | Object | HoursAgo |
+---------------------+--------+----------+
| 2019-02-01 10:00:00 | A      | 0        |
| 2019-02-01 09:00:00 | A      | 1        |
| 2019-02-01 08:00:00 | A      | 2        |
| 2019-02-01 10:00:00 | B      | 0        |
| 2019-02-01 09:00:00 | B      | 1        |
| 2019-02-01 08:00:00 | B      | 2      |
| 2019-01-01 14:00:00 | C      | 0        |
| 2019-01-01 13:00:00 | C      | 1        |
| 2019-01-01 12:00:00 | C      | 2        |
+---------------------+--------+----------+
Object A's latest timestamp is [2019-02-01 10:00:00] which is 0 hours ago.
The timestamp after that is [2019-02-01 09:00:00] which is 1 hour ago, etc.
Object C's latest timestamp is [2019-01-01 14:00:00], but the HoursAgo value is 0. This is because that timestamp is the latest for that object.
Thoughts on how I can implement this?
I tried loading in my fact tables, finding the max timestamp per object, and then calculating the HoursAgo value in the data load editor, but it takes an hour.
Anyone know a better way?
Thanks!!
S
It should be relatively quick to compute it in the script. How many rows are in your fact table? Can you post the script you tried?
-Rob
Hi Rob!
We are pulling in 30 QVD files (1 per day), resulting in 11.6 million rows of data in our fact table.
What we do in the script is:
1. Loop through and load each individual QVD file (optimized load)
2. After everything is loaded,
-- find the max timestamp per object and join it back to fact
MaxFinderByObject:
Load Object, max(timestamp) as maxTimestamp
Resident FactTable
Group by Object;

Join
Load * Resident FactTable;
-- calculate the hours ago from the max timestamp per object
FactTable_HoursAgo_Calculated:
Load interval(maxTimestamp - timestamp, 'h') as HoursAgo,
  *
Resident MaxFinderByObject;

Drop Tables MaxFinderByObject, FactTable;
Drop Field maxTimestamp;
This process takes 1 hour.
A few more questions:
1. How much time of the hour is taken by the MaxFinderByObject step?
2. I'm guessing you have a lot of Object values, about how many?
3. What version of Qlik?
-Rob
Hi Rob,
1. The MaxFinderByObject step takes 1 hour. The entire reload takes 1 hour and 30 minutes.
2. I have ~16,000 Objects
3. We are currently on version April 2018
Thanks,
S
A group by over a field with many distinct values can be a killer. Here are some things you might test to see if they perform better:
1. Sort the fact table by Object before the group by, or add an "Order by Object" to your current MaxFinderByObject load.
2. Instead of max() and group by, use sorting:
FactTable2:
Load *,
if(Object = Previous(Object), peek('MaxTimestamp'), timestamp) as MaxTimestamp
Resident FactTable
Order by Object, timestamp DESC;
Drop Table FactTable;
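
From there, HoursAgo can be computed in a second pass with no group by at all. A rough sketch (untested; the table name is just a placeholder):

FactTable3:
Load *,
  Interval(MaxTimestamp - timestamp, 'h') as HoursAgo
Resident FactTable2;

Drop Table FactTable2;
Drop Field MaxTimestamp;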
-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com
Hi Rob,
Thank you for the response 🙂
I agree that the group by + resident load seems to be the killer here 😞
I tried your latest recommended method and the reload time took 1 hour and 58 minutes, with most of the time being in the group by + resident.
I'm thinking of incrementally creating a QVD file per day containing the max(timestamp) per object for that day. Then, I would load those files into the app and determine the overall max(timestamp) per object. Then, I can keep the table containing the object and its overall max(timestamp) in the app as a reference table (or even use ApplyMap!).
In my mind, this process should be much quicker because the group by + resident load and sorting are applied to a much smaller data set.
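
Roughly, I'm picturing something like this (untested sketch; file names and the $(vDate) variable are placeholders):

// daily incremental step: store the max timestamp per object into a small QVD
DailyMax:
Load Object, max(timestamp) as maxTimestamp
From [Fact_$(vDate).qvd] (qvd)
Group by Object;

Store DailyMax into [DailyMax_$(vDate).qvd] (qvd);
Drop Table DailyMax;

// in the app: combine the 30 small files, then reduce to one row per object
AllDailyMax:
Load Object, maxTimestamp
From [DailyMax_*.qvd] (qvd);

MaxMap:
Mapping Load Object, max(maxTimestamp) as maxTimestamp
Resident AllDailyMax
Group by Object;

Drop Table AllDailyMax;

// fact load with the reference lookup applied
FactTable:
Load *,
  Interval(ApplyMap('MaxMap', Object) - timestamp, 'h') as HoursAgo
From [Fact_*.qvd] (qvd);

One thing I'd have to check: the ApplyMap in the fact load would break the optimized QVD load, so it might be better to keep the fact load optimized and apply the mapping in a separate resident pass.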
What I don't like is the added complexity...
Thoughts?
S
"I tried your latest recommended method and the reload time took 1 hour and 58 minutes, with most of the time being in the group by + resident."
I don't understand. My recommendation #2 did not use a group by. Did you try #2?
-Rob
Oh sorry,
I typed "group by" into the forum post, but I actually used "order by" in the load script.
S