Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
mgranillo
Specialist
Specialist

Count Distinct Accumulation in Qlik Sense

It seems there are a lot of posts covering count distinct accumulation over time but none of which worked for me.

Here is a solution I made that seems simple.

I sorted my table in the load script by my dimension (month in my case).  Then i used the following chart expression:

rangesum(

    above(

          sum(

                aggr(

                      count(distinct "Name")

              ,"Name")

              )

          ,0,RowNo())

)

Obviously, you'll need to replace "Name" with the column you are distinctly counting.  The result will be a nice accumulating line like below:

6e369a9e76d94bc5bc62cf046f2ff66a.png

EDIT: For those curious about how this is working, it is an unintended result from a grain mis-match on the aggr function (not recommended but it does work).  To explain, start by ignoring the "rangesum(above(" part of the expression.  This is the part needed for the final accumulation.  The inner part: "sum(aggr(count(distinct "Name"),"Name"))"  is where we see strange behavior.

Starting with our sample data:

LOAD * INLINE [

Month, Name

1,A

1,B

1,C

1,D

1,E

2,F

2,G

2,A

2,B

2,C

2,D

3,H

3,A

3,B

3,C

];

Qlik evaluates sum(aggr(count(distinct "Name"),"Name"))  by first applying it to our month dimension as in any chart (Month=1).  It then jumps to the next dimension we've specified in the aggr function, "Name", and evaluates all instances of the given name.  So "1,A", "2,A","3,A" are all evaluated for month =1 even though Month=1 is not the month associated with "2,A" and "3,A".  Count(distinct Name) is then applied to these three rows and returns a count of 1.  These steps are applied for each "Name" in Month =1 and returns a final count of 5.

Processing then jumps to month 2 and the evaluation steps continue but rows "2,A" through "2,D" are skipped because they were already evaluated when Month equaled 1.  The final count returned for Month=2 is 2 ("2,F" and "2,G" are counted).  Month 3 returns a count of 1 because all rows have been evaluated in previous steps with the exception of "3,H".  The final result is this table:

                                  

MonthDistinct Count by Month
15
22
31

The reason this method is not recommended is because Qlik doesn't know how to apply the inner dimension (Name) with the outer dimension (Month).  The load order is the only factor here that allows this to work.  We've essentially gotten lucky that our load order happens to make this work.  Had our table been loaded differently we would get a different result.

Say I move my second row, "1,B" to the bottom of my data load:

LOAD * INLINE [

Month, Name

1,A

1,C

1,D

1,E

2,F

2,G

2,A

2,B

2,C

2,D

3,H

3,A

3,B

3,C

1,B

];

The distinct count will now be 4,3,1 instead of the previous 5,2,1

MonthDistinct Count by Month
14
23
31

If we add in the accumulation part of the function ( "rangsum(above(" ) we arrive at the same final accumulation = 8 but the way we got there is different.  See the attached sample app for these two examples.

Happy Qliking!

Mike

6 Replies
sunny_talwar

Can you attach the sample to play around with

mgranillo
Specialist
Specialist
Author

Sunny,

I'll try to update this post today.  I just got some feedback from Henric Cronström on how this is working and I need to alter the expression and provide an explanation.  It works, but it's not recommended because it's a grain mis-match on the aggr function.

Mike

Emma1
Contributor III
Contributor III

Try this to get a plot of accumulated  distinct count o"Name" vs Month or whatever time range you please.

aggr(rangesum(

above(

sum(

aggr(

count(distinct "Name")

,"Name")

)

,0,RowNo())),Month,(Numeric))

)

meliscolak
Contributor
Contributor

thank you so much. you saved my world with this formula! 🙂

meliscolak
Contributor
Contributor

I want to make clear something, if you use this formula, you HAVE TO sort your data when you're loading it. Otherwise it can give silly results.

Purushothaman
Partner - Creator III
Partner - Creator III

@Emma1 Thanks for the expression. It was working in my scenario as well. Great help. Much appreciated.