Skip to main content
Announcements
Global Transformation Awards! Applications are now open. Submit Entry
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

How can I calculate a cumulative sum in load script?

I have a (large) dataset being loaded from text with a composite key and several metric columns. I would like to include a cumulative sum (similar to "full aggregation" in a visualization, but the full aggregation option does not allow me to perform all the calculations and interactions I need).

The general structure of the dataset is:

key1, key2, ... : identifiers for this grouping

percentile : 10 values per key 10, 20, ..., 100

metric1, metric2 : counts of interactions in each percentile block that I would like to aggregate

If I were doing this in sql, I would do it as a self-join -- ex:

select t1.key1

, t1.key2

, t1.percentile

, sum(t2.metric1) as cum_sum_metric1

, sum(t2.metric2) as cum_sum_metric2

from table t1

inner join table t2

on t1.key1=t2.key1

and t1.key2=t2.key2

and t1.percentile >= t2.percentile

group by t1.key1

, t1.key2

, t1.percentile

Is there a way to replicate this behavior in the Qlikview load script? Alternatively, if there is another way to accomplish this end, I am open to other suggestions.

Thanks,

--Matt

4 Replies
Jason_Michaelides
Luminary Alumni
Luminary Alumni

I don't think you can tell QlikView how to join tables directly so you have to get a bit sneaky about it. I think something like the following would work but it's untested...

Tab1:

Load

   Key1

   ,Key2

   ,Percentile

   ,etc

;

SQL Select * From Table1;

INNER JOIN (Tab1)

Load

   Key1

   ,Key2

   ,Percentile AS Percentile_T2

   ,Metric1

   ,Metric2

;

SQL Select * From Table2;

This will join all rows where Key1 and Key2 both match.

Now you can group and sum, but judging by your query above you only want to do this where Percentile >= Percentile2, so:

LEFT JOIN (Tab1)

Load

   Key1

   ,Key2

   ,Percentile

   ,Sum(Metric1) AS Accum1

   ,Sum(Metric2) AS Accum2

RESIDENT Tab1

WHERE Percentile >= Percentile2

GROUP BY Key1, Key2, Percentile;

I think that might get you close...

Hope this helps,

Jason

Not applicable
Author

Matt,

Load your data into a table (lowest level of granularity).

Then aggregate using a Resident table.

i.e.:

Tbl:

Load index, Cum1, Cum2, ...

From File1.qvd (qvd);

Tbl2:

noconcatenate

Load Index, Sum(Cum1) as Cum1, Sum(Cum2) as Cum2, ...

Resident Tbl group by index;

Drop Table Tbl;

Exit Script;

Unless you are trying to get an incremental aggregation, then use Excel (i.e. b2 = b1 + a2, b3= b2 + a3, etc.)

Regards,

jagan
Luminary Alumni
Luminary Alumni

Hi,

Use this sample script to find Cumulative Sum in script

TableName:

LOAD

    F1,

    F2,

    RangeSum(F2, Peek('CumSum')) AS CumSum;

LOAD * INLINE [

    F1, F2

    a, 100

    b, 200

    c, 300

    d, 400

];

Hope this helps you.

Regards,

Jagan.

philip_doyne
Partner - Creator II
Partner - Creator II

Thanks Jagan,  A very useful tip.   Just in case anyone is interested I have extened this to give me cumulative subtotals based on changes in F1.

Data:

LOAD * INLINE [

    F1, F2, F3

    a, 1,100

    a, 2,200

    b, 1,3000

    b, 2,4000

    c, 1,40000

    c, 2,40000

];

TableName:

Result:

NoConcatenate

LOAD

    F1,

    F2,

    F3,

    RangeSum(F3, Peek('CumAll')) as CumAll,

    if(F1<> previous(F1),F3,RangeSum(F3, Peek('CumF1'))) AS CumF1

RESIDENT Data;

DROP TABLE Data;

I found you need to be careful about the preceeding load so have done it in two passes - It worked for me on a real fife problem.