robindc
Contributor

Calculation in load script

Hi!

 

I've imported data from a table that gives feedback on emails, like whether it was opened, clicked, etc.

This generates a lot of data. I currently have everything from 20180101 up until today loaded - about 190 million records. I need all of these, as year-to-year analysis should be possible. For future use in other projects, I also need to import every record.

I built some measures that I need in a sheet. However, they take a long time to calculate over all these records.

I want to transform this so the calculations are performed (aggregated) in the load script.

Here's what I have now:

 

TRANS_FLAGS_UNIQUE:

LOAD 
     Min(Date(ACTIONDATE,'DD/MM/YYYY')) as Date,
     CAMPAIGNID,
     ACTIONID,
     PROBEID,
     USERID,
     if(match(mid(PROBEID,1,3),'-1_','-10'),'Yes','No') 		as Unique_Opened,
     if(PROBEID >'0' AND PROBEID <> '0_-_MKT' AND PROBEID<>'0_-_SIM' AND PROBEID<>'0_-_COMM','Yes','No') as Unique_Clicked,
     if(match(mid(PROBEID,1,3),'9_-','500'),'Yes','No') as Unique_Optout
FROM [lib://Manual Data Shared Folder/rdeconin/CRM RMD Campaign Analysis/QVD/RAW_F_RAWFLAGS.qvd]
(qvd)
group by CAMPAIGNID, ACTIONID, PROBEID, USERID;

How can I calculate the sum of all opened, clicked, etc. in the load script?
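One way to sketch this, assuming the TRANS_FLAGS_UNIQUE table above has already loaded, is a second aggregated load from it. The table name CAMPAIGN_TOTALS and the Total_* field names are made up for illustration; the flag fields come from the script in the question:

```qlik
// Convert the Yes/No flags to 1/0 and sum them per campaign and action.
CAMPAIGN_TOTALS:
LOAD
     CAMPAIGNID,
     ACTIONID,
     Sum(If(Unique_Opened  = 'Yes', 1, 0)) as Total_Opened,
     Sum(If(Unique_Clicked = 'Yes', 1, 0)) as Total_Clicked,
     Sum(If(Unique_Optout  = 'Yes', 1, 0)) as Total_Optout
RESIDENT TRANS_FLAGS_UNIQUE
GROUP BY CAMPAIGNID, ACTIONID;
```

Note that this resident aggregation still has to pass over all 190 M records, which is exactly the cost the reply below warns about.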

 

Thanks!

1 Reply
marcus_sommer

I don't think I would try to solve this task by aggregating the data within the script. An aggregation load over 190 M records with multiple grouping fields is a very heavy transformation and will take quite some time, and I assume that with all these grouping fields you won't get a very high compression rate - probably something like 1:5 or less, which means the table would still hold around 40 M records.

I suggest that you load these data without any aggregation - only performing the needed flagging and splitting the dates and times into separate fields (I assume that your Date field is really a timestamp - and this may already be done one step earlier, when the qvd is created).
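A sketch of that non-aggregated flag load, reusing the OP's source qvd and conditions; the numeric 1/0 flag names (F_Opened etc.) and the ActionDate/ActionTime split are assumptions for illustration:

```qlik
TRANS_FLAGS:
LOAD
     Date(Floor(ACTIONDATE)) as ActionDate,   // calendar date part of the timestamp
     Time(Frac(ACTIONDATE))  as ActionTime,   // time-of-day part of the timestamp
     CAMPAIGNID,
     ACTIONID,
     PROBEID,
     USERID,
     If(Match(Mid(PROBEID,1,3),'-1_','-10'), 1, 0) as F_Opened,
     If(PROBEID > '0' AND PROBEID <> '0_-_MKT' AND PROBEID <> '0_-_SIM'
        AND PROBEID <> '0_-_COMM', 1, 0) as F_Clicked,
     If(Match(Mid(PROBEID,1,3),'9_-','500'), 1, 0) as F_Optout
FROM [lib://Manual Data Shared Folder/rdeconin/CRM RMD Campaign Analysis/QVD/RAW_F_RAWFLAGS.qvd]
(qvd);
```

Numeric 1/0 flags (instead of 'Yes'/'No' strings) keep the front-end expressions as plain sums, which is the cheap pattern described below.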

Maybe you need some further measures for all your requirements, but if you can apply rather simple expressions in the end, like sum/count({ set analysis } field), it should work quite smoothly even with 190 M records. If you apply (nested) if-loops and/or aggr(), it will probably be too slow (and it wouldn't be much better with only 40 M records).
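For example, assuming numeric 1/0 flags and a date field like the hypothetical F_Opened and ActionDate, the chart expressions stay as simple sums with optional set analysis:

```qlik
// Total unique opens:
Sum(F_Opened)

// Unique opens restricted to the current year via set analysis:
Sum({<ActionDate = {">=$(=YearStart(Max(ActionDate)))"}>} F_Opened)
```

Both are single-pass aggregations over the flag field, which the engine handles well even at this record volume.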

- Marcus