Qlik Community

Qlik Sense App Development

Discussion board where members can learn more about Qlik Sense App Development and Usage.

Contributor

Calculation in load script

Hi!

 

I've imported data from a table that records feedback on emails, such as whether each one was opened, clicked, etc.

This generates a lot of data. I currently have everything from 20180101 up until today loaded, which is roughly 190 million records. I need all of them because year-over-year analysis must be possible, and for future use in other projects I also need to import every record.

I built some measures that I need in a sheet. However, they take a long time to calculate over all these records.

I want to move the (aggregated) calculations into the load script instead.

Here's what I have now:

 

TRANS_FLAGS_UNIQUE:

LOAD
    Min(Date(ACTIONDATE, 'DD/MM/YYYY')) as Date,
    CAMPAIGNID,
    ACTIONID,
    PROBEID,
    USERID,
    If(Match(Mid(PROBEID, 1, 3), '-1_', '-10'), 'Yes', 'No') as Unique_Opened,
    If(PROBEID > '0' and PROBEID <> '0_-_MKT' and PROBEID <> '0_-_SIM' and PROBEID <> '0_-_COMM', 'Yes', 'No') as Unique_Clicked,
    If(Match(Mid(PROBEID, 1, 3), '9_-', '500'), 'Yes', 'No') as Unique_Optout
FROM [lib://Manual Data Shared Folder/rdeconin/CRM RMD Campaign Analysis/QVD/RAW_F_RAWFLAGS.qvd] (qvd)
Group By CAMPAIGNID, ACTIONID, PROBEID, USERID;

How can I calculate the sum of all opened, clicked, etc. in the load script?

 

Thanks!

1 Reply
MVP & Luminary

I don't think I would try to solve this task by aggregating the data within the script. An aggregation load over 190 M records with multiple grouping fields is a very heavy transformation and will inevitably take some time, and with all these grouping fields you won't get a very high compression rate - probably something like 1:5 or less, which still leaves around 40 M records in the table.

I suggest loading the data without any aggregation - only performing the needed flagging and splitting dates and times into separate fields (I assume your Date field is really a timestamp - this may already be done one step earlier, when the QVD is created).
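A minimal sketch of such a non-aggregated flagging load, assuming the same source QVD and field names as in the question (the PROBEID rules are copied from the posted script; the flags are created as numeric 0/1 instead of 'Yes'/'No' so they can be summed directly in the front end):

```qlik
TRANS_FLAGS:
LOAD
    Date(Floor(ACTIONDATE)) as Date,   // date part of the timestamp
    Time(Frac(ACTIONDATE))  as Time,   // time part in a separate field
    CAMPAIGNID,
    ACTIONID,
    PROBEID,
    USERID,
    // numeric 0/1 flags instead of 'Yes'/'No' strings
    If(Match(Mid(PROBEID, 1, 3), '-1_', '-10'), 1, 0) as Opened_Flag,
    If(PROBEID > '0' and PROBEID <> '0_-_MKT' and PROBEID <> '0_-_SIM'
       and PROBEID <> '0_-_COMM', 1, 0)               as Clicked_Flag,
    If(Match(Mid(PROBEID, 1, 3), '9_-', '500'), 1, 0) as Optout_Flag
FROM [lib://Manual Data Shared Folder/rdeconin/CRM RMD Campaign Analysis/QVD/RAW_F_RAWFLAGS.qvd] (qvd);
```

With 0/1 flags a chart measure can be a plain Sum(Opened_Flag), with no per-row if() needed.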

You may need some further measures for all your requirements, but if in the end you can apply rather simple expressions like sum/count({ set analysis } field), it should work quite smoothly even with 190 M records. As soon as you apply (nested) if() conditions and/or aggr(), it will probably be too slow (and it wouldn't be much better with only 40 M records).
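For example, with numeric 0/1 flags in the model (field names assumed, and a Year field assumed to be derived from the date), the front-end measures could stay as simple aggregations with set analysis:

```qlik
// total unique opens in 2019
Sum({< Year = {2019} >} Opened_Flag)

// click rate over the current selection
Sum(Clicked_Flag) / Count(DISTINCT USERID)
```

Such expressions aggregate over simple fields without row-level if() logic, which is what keeps them fast on large tables.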

- Marcus