Sazabi
Creator

Calculating hours ago from the max timestamp per object?

Salutations Qlik Compadres.

I am looking to return, for each row, how many hours it falls before the max timestamp of its object.

 

For example:

+---------------------+--------+----------+
| Timestamp           | Object | HoursAgo |
+---------------------+--------+----------+
| 2019-02-01 10:00:00 | A      | 0        |
| 2019-02-01 09:00:00 | A      | 1        |
| 2019-02-01 08:00:00 | A      | 2        |
| 2019-02-01 10:00:00 | B      | 0        |
| 2019-02-01 09:00:00 | B      | 1        |
| 2019-02-01 08:00:00 | B      | 2        |
| 2019-01-01 14:00:00 | C      | 0        |
| 2019-01-01 13:00:00 | C      | 1        |
| 2019-01-01 12:00:00 | C      | 2        |
+---------------------+--------+----------+

Object A's latest timestamp is [2019-02-01 10:00:00], which is 0 hours ago.

The timestamp after that is [2019-02-01 09:00:00], which is 1 hour ago, and so on.

 

Object C's latest timestamp is [2019-01-01 14:00:00], and even though it is older than A's and B's, its HoursAgo value is still 0, because HoursAgo is measured against each object's own latest timestamp.

 

Thoughts on how I can implement this?
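For reference, the calculation being asked for can be sketched in plain Python (hypothetical sample data mirroring the table above): find each object's max timestamp, then take the difference in hours for every row.

```python
from datetime import datetime

def hours_ago(rows):
    """For each (timestamp, object) row, return hours between the row
    and the max timestamp of its object."""
    parsed = [(datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), obj)
              for ts, obj in rows]
    # Max timestamp per object
    latest = {}
    for ts, obj in parsed:
        if obj not in latest or ts > latest[obj]:
            latest[obj] = ts
    # Hours from each row back to its object's own max
    return [(ts, obj, (latest[obj] - ts).total_seconds() / 3600)
            for ts, obj in parsed]

rows = [
    ("2019-02-01 10:00:00", "A"),
    ("2019-02-01 09:00:00", "A"),
    ("2019-01-01 14:00:00", "C"),
    ("2019-01-01 12:00:00", "C"),
]
print(hours_ago(rows))  # HoursAgo values: 0.0, 1.0, 0.0, 2.0
```

Note that object C's newest row still gets 0 even though it is older than A's, because the max is taken per object.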

I tried loading in my fact tables, finding the max timestamp per object, and then calculating the HoursAgo value in the data load editor, but it takes an hour to run.

 

Anyone know a better way? 

 

Thanks!!

 

S

8 Replies
rwunderlich
Partner Ambassador/MVP

It should be relatively quick to compute it in the script. How many rows are in your fact table?  Can you post the script you tried?

-Rob

Sazabi
Creator
Author

Hi Rob!

We are pulling in 30 QVD files (1 per day), resulting in 11.6 million rows of data in our fact table.

 

What we do in the script is:

1. Loop through and load each individual QVD file (optimized load)

2. After everything is loaded:

// find the max timestamp per object and join it back to the fact table
MaxFinderByObject:
Load
    Object,
    max(timestamp) as maxTimestamp
Resident FactTable
Group by Object;

Join
Load * Resident FactTable;

// calculate the hours ago from the max timestamp per object
FactTable_HoursAgo_Calculated:
Load
    interval(maxTimestamp - timestamp, 'h') as HoursAgo,
    *
Resident MaxFinderByObject;

Drop Table MaxFinderByObject, FactTable;
Drop Field maxTimestamp;

This process takes 1 hour.

rwunderlich
Partner Ambassador/MVP

A few more questions:

1. How much of the hour is taken by the MaxFinderByObject step?

2. I'm guessing you have a lot of Object values, about how many?

3. What version of Qlik?

-Rob

Sazabi
Creator
Author

Hi Rob,

 

1. The MaxFinderByObject step takes 1 hour. The entire reload takes 1 hour and 30 minutes.

2. I have ~16,000 Objects

3. We are currently on the April 2018 release

 

Thanks,

 

S

rwunderlich
Partner Ambassador/MVP

A Group By over a field with many distinct values can be a killer. Here are some things you might test to see if they perform better.

1. Sort the fact table by Object before the group by. Or add an "order by Object" to your current MaxFinderByObject load.

2.  Instead of max() and group by, use sorting:

FactTable2:
Load *,
  // sorted timestamp DESC within each Object, so the first row of a new
  // Object holds its max; later rows carry it forward via Peek()
  if(Object = Previous(Object), Peek('MaxTimestamp'), timestamp) as MaxTimestamp
Resident FactTable
Order by Object, timestamp DESC;

Drop Table FactTable;
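The sort-then-carry-forward idea behind this Previous()/Peek() pattern can be illustrated with a short Python sketch (hypothetical data): after sorting by object ascending and timestamp descending, the first row of each object holds the group max, and later rows simply reuse it in a single pass, with no group by at all.

```python
from datetime import datetime

def carry_max(rows):
    """rows: (timestamp, object) pairs. Mirrors 'Order by Object,
    timestamp DESC' followed by a Previous()/Peek()-style single pass."""
    parsed = [(datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), obj)
              for ts, obj in rows]
    parsed.sort(key=lambda r: r[0], reverse=True)  # timestamp DESC
    parsed.sort(key=lambda r: r[1])                # stable sort: Object ASC
    out, prev_obj, group_max = [], None, None
    for ts, obj in parsed:
        if obj != prev_obj:  # new Object: this row carries the group max
            group_max = ts
        out.append((ts, obj, (group_max - ts).total_seconds() / 3600))
        prev_obj = obj
    return out

data = [
    ("2019-02-01 08:00:00", "A"),
    ("2019-02-01 10:00:00", "A"),
    ("2019-02-01 09:00:00", "A"),
]
print(carry_max(data))  # HoursAgo values: 0.0, 1.0, 2.0
```

The two stable sorts emulate a compound `Order by Object, timestamp DESC`; the loop body is the moral equivalent of `if(Object = Previous(Object), Peek('MaxTimestamp'), timestamp)`.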

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com

Sazabi
Creator
Author

Hi Rob, 

Thank you for the response 🙂

I agree that the group by + resident load seems to be the killer here 😞

I tried your latest recommended method and the reload time took 1 hour and 58 minutes, with most of the time being in the group by + resident.

 

I'm thinking of incrementally creating one QVD file per day containing the max(timestamp) per object for that day. Then I would load those files into the app and determine the overall max(timestamp) per object. That table of Object and latest max(timestamp) can stay in the app as a reference table (or even be used with ApplyMap!).

 

In my mind, this process should be much quicker because the group by + resident load and sorting are applied to a much smaller data set.
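The incremental idea can be sketched in Python (hypothetical structures, not the actual QVD layout): reduce each daily file to a tiny per-object max table, then fold those small tables together, so the expensive aggregation only ever touches pre-aggregated rows.

```python
def daily_max(day_rows):
    """Collapse one day's (timestamp, object) rows to {object: max timestamp},
    stored per day like a small daily QVD. ISO timestamps sort as strings."""
    m = {}
    for ts, obj in day_rows:
        if obj not in m or ts > m[obj]:
            m[obj] = ts
    return m

def overall_max(daily_tables):
    """Fold the small per-day tables into one object -> latest-timestamp
    map, usable as an ApplyMap-style lookup table."""
    overall = {}
    for day in daily_tables:
        for obj, ts in day.items():
            if obj not in overall or ts > overall[obj]:
                overall[obj] = ts
    return overall

day1 = daily_max([("2019-01-31 09:00:00", "A"), ("2019-01-31 11:00:00", "A")])
day2 = daily_max([("2019-02-01 10:00:00", "A"), ("2019-02-01 08:00:00", "B")])
lookup = overall_max([day1, day2])
# lookup["A"] is "2019-02-01 10:00:00"
```

With ~16,000 objects and 30 days, the fold step touches at most 30 × 16,000 pre-aggregated rows instead of 11.6 million fact rows.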

 

What I don't like is the added complexity...

 

Thoughts?

 

S

rwunderlich
Partner Ambassador/MVP

"I tried your latest recommended method and the reload time took 1 hour and 58 minutes, with most of the time being in the group by + resident."

I don't understand. My recommendation #2 did not use a group by. Did you try #2?

-Rob

Sazabi
Creator
Author

Oh sorry,

 

I typed "group by" in the forum post, but I actually used "order by" in the load script.

 

S