Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Employee
Employee

The nodistinct qualifier and the Aggr function

Image Not found

There's a lot of content written about the Aggr function in the Qlik Community and in this blog as well.  No doubt why, Aggr is one of the most powerful, and sometimes intriguing, functions we have in Qlik.

 

Today, I would like to explain the use of nodistinct and Aggr and the importance of granularity with a simple example.

 

I have a table containing sales data, I load that table like this:

 

ProductData:

LOAD * inline [

Customer|Product|UnitSales|UnitPrice
Astrida|AA|4|16
Astrida|AA|10|15
Astrida|BB|9|9
Betacab|BB|5|10
Betacab|CC|2|20
Betacab|DD|25|25
Canutility|AA|8|15
Canutility|CC||19

] (delimiter is '|');

 

My goal is to display a table in Qlik (Sense or QlikView) that shows the sales by customer and product.  I need to add a column that contains the percentage that each product represents over the total sales to each customer, I called it Share, the resulting table should look like this:

goal.png

To create a table like that, I could use the Aggr function, and I could write an expression that calculates the sales amount for each pair “customer product” and then divide that number by the total sales amount by customer. 

 

sum(UnitSales*UnitPrice)/only(aggr(sum(UnitSales*UnitPrice), Customer))

 

 

If I use that expression, then the resulting table looks like the picture below, where only the first pair "customer product" contains a valid result. 

issue.png

 

 

Understanding granularity

 

The issue here is granularity. My chart has more dimensions - is more granular, contains Customer and Product -  than my Aggr function, only contains Customer.

 

The expression only(aggr(sum(UnitSales*UnitPrice), Customer) returns an array of three values, one for each customer. 

 

As you can see in the table above, when an aggr array containing three values goes into the chart with higher granularity, only the fist row for the parent dimension contains a valid expression. The rest of the rows are performing a calculation that returns null. Is performing something like "sum(UnitSales*UnitPrice) divided by null"

 

Using nodistinct

 

Qlik's Aggr help page: If the expression argument is preceded by the nodistinct qualifier, each combination of dimension values may generate more than one return value, depending on underlying data structure. If there is only one dimension, the aggr function will return an array with the same number of elements as there are rows in the source data. 

 

In other words, if the nodistinct qualifier is used, then our example's aggr array will contain 7 rows of data, it will repeat the value for each customer across the rows that contain that customer:

 

aggr.png

 

Finally, to complete our example and to get a table that looks exactly as the one requested, we just need to add the nodistinct qualifier to our Aggr function:

sum(UnitSales*UnitPrice)/only(aggr(nodistinct sum(UnitSales*UnitPrice), Customer))

 

I hope this makes sense and you find it useful.

Arturo (@arturoqv)

 

4 Comments
MVP & Luminary
MVP & Luminary

Your example demonstrates nicely the why and how of the Nodistinct qualifier in the aggr function. 

But! In the hope of preventing users from using the aggr function in the wrong situation, that means a case like this one, I have to add that here the Total qualifier should be used. That means this expression: 

sum(UnitPrice*UnitSales)/sum( TOTAL <Customer> UnitPrice*UnitSales)

For more information see this blog post: 

https://community.qlik.com/t5/Qlik-Design-Blog/When-should-the-Aggr-function-NOT-be-used/ba-p/146785...

1,735 Views
Employee
Employee

Right, I don't want to be the one disagreeing with HIC ;), so yes, an expression using TOTAL is better than Aggr function for this example (spoiler alert: I was planning to continue with this blog post's example and explain TOTAL in my next blog post).

 

But, if you don't need to display the totals row in the chart and/or you don't have a large dataset then I would say is just fine to use aggr for this use case scenario.

1,711 Views
anderseriksson
Contributor III

 Using Sum(TOTAL ...) has nothing to do with if you display total rows in the chart.

1,567 Views
christian77
Valued Contributor

In my opinion the most powerful function in Qlik. Allows to do calculation grouping with no need of a table. Yes, it takes time to calculate if they are many groups. It’s not compulsory to use it, anyway. If I want to do an average of a metric outside a dimension, I’ll probably use aggr function. The NODISTINCT modifier helps to mix dimensions and granularity giving the correct result. It can be used in a KPI where there is no dimension to use. Understanding this function place you a step ahead in Qlik, both data modeling and design.

I have solved so many problems with aggr function and NODISTINCT modifier that I couldn't live without it.

I say so many... many... many...

 

 

 

1,292 Views