Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I have a (large) dataset being loaded from text with a composite key and several metric columns. I would like to include a cumulative sum (similar to "full aggregation" in a visualization, but the full aggregation option does not allow me to perform all the calculations and interactions I need).
The general structure of the dataset is:
key1, key2, ... : identifiers for this grouping
percentile : 10 values per key 10, 20, ..., 100
metric1, metric2 : counts of interactions in each percentile block that I would like to aggregate
If I were doing this in sql, I would do it as a self-join -- ex:
select t1.key1
, t1.key2
, t1.percentile
, sum(t2.metric1) as cum_sum_metric1
, sum(t2.metric2) as cum_sum_metric2
from table t1
inner join table t2
on t1.key1=t2.key1
and t1.key2=t2.key2
and t1.percentile >= t2.percentile
group by t1.key1
, t1.key2
, t1.percentile
Is there a way to replicate this behavior in the Qlikview load script? Alternatively, if there is another way to accomplish this end, I am open to other suggestions.
Thanks,
--Matt
I don't think you can tell QlikView how to join tables directly so you have to get a bit sneaky about it. I think something like the following would work but it's untested...
Tab1:
Load
Key1
,Key2
,Percentile
,etc
;
SQL Select * From Table1;
INNER JOIN (Tab1)
Load
Key1
,Key2
,Percentile AS Percentile_T2
,Metric1
,Metric2
;
SQL Select * From Table2;
This will join all rows where Key1 and Key2 both match.
Now you can group and sum, but judging by your query above you only want to do this where Percentile >= Percentile2, so:
LEFT JOIN (Tab1)
Load
Key1
,Key2
,Percentile
,Sum(Metric1) AS Accum1
,Sum(Metric2) AS Accum2
RESIDENT Tab1
WHERE Percentile >= Percentile2
GROUP BY Key1, Key2, Percentile;
I think that might get you close...
Hope this helps,
Jason
Matt,
Load your data into a table (lowest level of granularity).
Then aggregate using a Resident table.
i.e.:
Tbl:
Load index, Cum1, Cum2, ...
From File1.qvd (qvd);
Tbl2:
noconcatenate
Load Index, Sum(Cum1) as Cum1, Sum(Cum2) as Cum2, ...
Resident Tbl group by index;
Drop Table Tbl;
Exit Script;
Unless you are trying to get an incremental aggregation, then use Excel (i.e. b2 = b1 + a2, b3= b2 + a3, etc.)
Regards,
Hi,
Use this sample script to find Cumulative Sum in script
TableName:
LOAD
F1,
F2,
RangeSum(F2, Peek('CumSum')) AS CumSum;
LOAD * INLINE [
F1, F2
a, 100
b, 200
c, 300
d, 400
];
Hope this helps you.
Regards,
Jagan.
Thanks Jagan, A very useful tip. Just in case anyone is interested I have extened this to give me cumulative subtotals based on changes in F1.
Data:
LOAD * INLINE [
F1, F2, F3
a, 1,100
a, 2,200
b, 1,3000
b, 2,4000
c, 1,40000
c, 2,40000
];
TableName:
Result:
NoConcatenate
LOAD
F1,
F2,
F3,
RangeSum(F3, Peek('CumAll')) as CumAll,
if(F1<> previous(F1),F3,RangeSum(F3, Peek('CumF1'))) AS CumF1
RESIDENT Data;
DROP TABLE Data;
I found you need to be careful about the preceeding load so have done it in two passes - It worked for me on a real fife problem.