Qlik Community

New to Qlik Sense

If you’re new to Qlik Sense, start with this Discussion Board and get up-to-speed quickly.

somebodysomeone
Contributor II

AGGR OPTIMIZATION

Hello there,

 

I have 40 million records and 5 million unique customer IDs. The Aggr below works correctly but is slow. What can I do to make it faster? I want to split customers into 3 groups according to the change in their volume between two dates: increase, decrease, or zero change.

 

 

=if(aggr((sum({<FIRSTDATE={"$(=date(T1,'DD.MM.YYYY'))"}>} VOLUME) -
sum({<SECONDDATE={"$(=date(MAX(T2),'DD.MM.YYYY'))"}>} VOLUME)), [CUSTOMERID]) > 0, 'INCREASE',
if(aggr((sum({<FIRSTDATE={"$(=date(T1,'DD.MM.YYYY'))"}>} VOLUME) -
sum({<SECONDDATE={"$(=date(MAX(T2),'DD.MM.YYYY'))"}>} VOLUME)), [CUSTOMERID]) < 0, 'DECREASE',
'ZERO'))

 

I cannot pre-aggregate it in the script or in QVD files because there are many dates and many possible combinations between them. I have to calculate and aggregate it in the chart.

 

Thanks for your help.

 

 

 

 

3 Replies
marcus_sommer
MVP & Luminary

The expression could be optimized by removing the second aggr, for example with something like this:

pick(match(
   sign(aggr(sum({<FIRSTDATE={"$(=date(T1,'DD.MM.YYYY'))"}>} VOLUME) -
             sum({<SECONDDATE={"$(=date(MAX(T2),'DD.MM.YYYY'))"}>} VOLUME), [CUSTOMERID])),
   -1, 0, 1), 'DECREASE', 'ZERO', 'INCREASE')

If this is still too slow, you will need to consider changes within your data model to reduce the number of tables to which the involved fields belong, and also their associations (i.e. developing the data model in the direction of a star schema or one big flat table).
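As an illustration of the flat-table direction, the dimension fields could be joined into the fact table in the load script. This is only a minimal sketch under assumed names - everything except CUSTOMERID, FIRSTDATE, SECONDDATE and VOLUME (the table names, paths, key and dimension fields) is hypothetical:

```
// Hypothetical sketch: join dimension fields into the fact table
// so the chart expression only has to touch one table.
Facts:
LOAD CUSTOMERID,
     FIRSTDATE,
     SECONDDATE,
     VOLUME,
     %CustomerKey                 // assumed key to a customer dimension
FROM [lib://Data/Facts.qvd] (qvd);

LEFT JOIN (Facts)
LOAD %CustomerKey,
     REGION,                      // assumed dimension fields
     SEGMENT
FROM [lib://Data/Customers.qvd] (qvd);
```

With everything in one table, the engine no longer has to resolve associations across tables when it builds the virtual table for the chart.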

- Marcus

 

somebodysomeone
Contributor II
Author

Unfortunately, it works, but slower than the previous version 😕

I have a star schema and it has just 4 dimensions.
marcus_sommer
MVP & Luminary

Usually the above-mentioned approach of removing redundant calculations from an expression improves the performance (an if() always calculates all branches completely, and the true/false check of the branches is applied afterwards).

Nevertheless, there might be options within your data model to optimize this expression. If the expression uses fields from all 4 dimension tables and the fact table, the virtual table on which the calculation is performed might be quite heavy - and although aggregations like the sum() here are multi-threaded, the creation of that virtual table is a single-threaded process. More on this can be found here: Logical-Inference-and-Aggregations
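One further direction, sketched here only as a possibility since the original post notes that the date combinations are open-ended: if the fact table can be collapsed to one row per customer and date in the script, the chart-time aggr() has far fewer rows to scan. Table and field names other than CUSTOMERID, FIRSTDATE and VOLUME are hypothetical:

```
// Hypothetical sketch: pre-aggregate volume per customer and date,
// so the chart works on roughly 5M x (number of dates) rows
// instead of 40M raw fact rows.
VolumeByCustomerDate:
LOAD CUSTOMERID,
     FIRSTDATE,
     sum(VOLUME) as DAY_VOLUME
RESIDENT Facts                     // assumed name of the raw fact table
GROUP BY CUSTOMERID, FIRSTDATE;
```

The chart expression would then sum DAY_VOLUME with the same set analysis instead of summing VOLUME over the raw facts.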

- Marcus