How can I calculate a cumulative sum in load scrip... - Qlik Community

Report Inappropriate Content · ‎2012-04-02

I have a (large) dataset being loaded from text with a composite key and several metric columns. I would like to include a cumulative sum (similar to "full aggregation" in a visualization, but the full aggregation option does not allow me to perform all the calculations and interactions I need).

The general structure of the dataset is:

key1, key2, ... : identifiers for this grouping

percentile : 10 values per key 10, 20, ..., 100

metric1, metric2 : counts of interactions in each percentile block that I would like to aggregate

If I were doing this in sql, I would do it as a self-join -- ex:

select t1.key1
, t1.key2
, t1.percentile
, sum(t2.metric1) as cum_sum_metric1
, sum(t2.metric2) as cum_sum_metric2
from table t1
inner join table t2 
on t1.key1=t2.key1
and t1.key2=t2.key2
and t1.percentile >= t2.percentile
group by t1.key1
, t1.key2
, t1.percentile

Is there a way to replicate this behavior in the Qlikview load script? Alternatively, if there is another way to accomplish this end, I am open to other suggestions.

Thanks,

--Matt

Jason_Michaelides · ‎2012-04-02

I don't think you can tell QlikView how to join tables directly so you have to get a bit sneaky about it. I think something like the following would work but it's untested...

Tab1:

Load

Key1

,Key2

,Percentile

,etc

;

SQL Select * From Table1;

INNER JOIN (Tab1)

Load

Key1

,Key2

,Percentile AS Percentile_T2

,Metric1

,Metric2

;

SQL Select * From Table2;

This will join all rows where Key1 and Key2 both match.

Now you can group and sum, but judging by your query above you only want to do this where Percentile >= Percentile2, so:

LEFT JOIN (Tab1)

Load

Key1

,Key2

,Percentile

,Sum(Metric1) AS Accum1

,Sum(Metric2) AS Accum2

RESIDENT Tab1

WHERE Percentile >= Percentile2

GROUP BY Key1, Key2, Percentile;

I think that might get you close...

Hope this helps,

Jason

Report Inappropriate Content · ‎2012-04-02

Matt,

Load your data into a table (lowest level of granularity).

Then aggregate using a Resident table.

i.e.:

Tbl:

Load index, Cum1, Cum2, ...

From File1.qvd (qvd);

Tbl2:

noconcatenate

Load Index, Sum(Cum1) as Cum1, Sum(Cum2) as Cum2, ...

Resident Tbl group by index;

Drop Table Tbl;

Exit Script;

Unless you are trying to get an incremental aggregation, then use Excel (i.e. b2 = b1 + a2, b3= b2 + a3, etc.)

Regards,

jagan · ‎2012-04-03

Hi,

Use this sample script to find Cumulative Sum in script

TableName:

LOAD

F1,

F2,

RangeSum(F2, Peek('CumSum')) AS CumSum;

LOAD * INLINE [

F1, F2

a, 100

b, 200

c, 300

d, 400

];

Hope this helps you.

Regards,

Jagan.

philip_doyne · ‎2012-09-01

Thanks Jagan, A very useful tip. Just in case anyone is interested I have extened this to give me cumulative subtotals based on changes in F1.

Data:

LOAD * INLINE [

F1, F2, F3

a, 1,100

a, 2,200

b, 1,3000

b, 2,4000

c, 1,40000

c, 2,40000

];

TableName:

Result:

NoConcatenate

LOAD

F1,

F2,

F3,

RangeSum(F3, Peek('CumAll')) as CumAll,

if(F1<> previous(F1),F3,RangeSum(F3, Peek('CumF1'))) AS CumF1

RESIDENT Data;

DROP TABLE Data;

I found you need to be careful about the preceeding load so have done it in two passes - It worked for me on a real fife problem.

How can I calculate a cumulative sum in load script?