Solved: Aggr - Where to place conditional set analysis - Qlik Community

Anonymous · ‎2016-01-27

I have an aggr calculation that is working for me:

avg(aggr(DPO,[Invoice ID]))

However, I want to add some additional criteria in here such as DataType= 'A' and FiscalPeriod='201502'.

I have experimented but I am not sure where to put the additional filtering statements.

Any help is appreciated.

Thank you,

- dave

swuehl · ‎2016-01-27

Try

=avg({<DateType = {A},FiscalPeriod={201502}>} aggr(Only({<DateType = {A},FiscalPeriod={201502}>}DPO),[Invoice ID]))

View solution in original post

swuehl · ‎2016-01-27

Try

=avg({<DateType = {A},FiscalPeriod={201502}>} aggr(Only({<DateType = {A},FiscalPeriod={201502}>}DPO),[Invoice ID]))

sunny_talwar · ‎2016-01-27

May be this:

=Avg({<DataType= {'A'}, FiscalPeriod={'201502'}>}Aggr(Only({<DataType= {'A'}, FiscalPeriod={'201502'}>}DPO), [Invoice ID]))

Oleg_Troyansky · ‎2016-01-27

Hi Dave,

this is something that every student at the Master Summit for Qlik learns from my lecture about AGGR and Set Analysis:

When AGGR and Set Analysis are used together in a nested aggregation, the Set Analysis condition needs to be placed in BOTH the outer and the inner aggregation functions, or else the condition may not work properly.

Your formula, however, doesn't look like a nested aggregation, and yet it is. When you don't specify any aggregation function in a place where it's expected (like within an AGGR, for example, the default aggregation ONLY() is presumed. So, your formula should really look like this:

avg(

aggr(

only(DPO)

,[Invoice ID])

)

Now, in this nested aggregation you should add your Set Analysis twice:

avg( {<DataType= {'A'} >}

aggr(

only( {<DataType= {'A'} >} DPO)

,[Invoice ID])

)

cheers,

Oleg Troyansky

Check out my new book QlikView Your Business

Oleg Troyansky
Call me for expert support, performance tuning, advanced training and more!

maxgro · ‎2016-01-27

see here, at point 3)

Pitfalls of the Aggr function

maybe (replace ??? with an aggregation, ie sum, min, max, ...)

avg({$ <DataType={A}, FiscalPeriod={201502}>} aggr({$ <DataType={A}, FiscalPeriod={201502}>} ???(DPO), [Invoice ID]))

rwunderlich · ‎2016-01-27

Others have given the syntax, but I might ask why use aggr() at all? Why not just

avg({<DateType = {'A'},FiscalPeriod={'201502'}>} DPO)

As I read the aggr() you posted, it would avg DPO values when there was only one DPO value for that [Invoive ID]. Is that what you wanted or is that the normal state of affairs?

-Rob

Anonymous · ‎2016-01-27

Rob -

My main data source is an invoice lines file. The DPO value is assigned at the invoice header level, so the relationship between invoice ID and DPO is 1 to 1. However, the relationship between invoice id and invoice lines is one to many. So when I was using the avg() without the aggr, my numbers were skewed as the DPO values were repeated depending on the number of lines on each invoice.

Therefore, I needed to make sure I was calculating the avg DPO value ONLY on distinct invoice IDs.

Oleg_Troyansky · ‎2016-01-27

Sounds like you could be better off with a weighted average calculation, rather than a simple average...

Oleg Troyansky
Call me for expert support, performance tuning, advanced training and more!

swuehl · ‎2016-01-27

My main data source is an invoice lines file. The DPO value is assigned at the invoice header level, so the relationship between invoice ID and DPO is 1 to 1. However, the relationship between invoice id and invoice lines is one to many. So when I was using the avg() without the aggr, my numbers were skewed as the DPO values were repeated depending on the number of lines on each invoice.

Therefore, I needed to make sure I was calculating the avg DPO value ONLY on distinct invoice IDs.

I am not sure I do understand this.

Avg(DPO) should be performed in the scope of the table where DPO is located, so not using duplicated values per invoice line.

In which context are you using this expression and / or are you using some modifications of the discussed expression?

Anonymous · ‎2016-01-27

Swuehl - An example is below:

INVOICE ID, DPO, LINE NUMBER

001,20,1

001,20,2

001,20,3

002,30,1

002,30,2

The avg(DPO) for above gives me 24

What I want is an average DPO based on DISTINCT INVOICE IDs, so the avg DPO should be 25 in this example.

The DPO value is a created field in my data model. There is not a separate table that lists INVOICE IDs and DPO with a 1 to 1 relationship.