Skip to main content
Announcements
Qlik Community Office Hours - Bring your Ideation questions- May 15th, 11 AM ET: REGISTER NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

Cumulative Count

I'm trying to do a cumulative count distinct by date. For example, given this data:

Date, Userid
2/1, A
2/1, B
2/2, C
2/3, A

I want to create this chart:

Date, CumDistinct Userid
2/1, 2
2/2, 3
2/3, 3

It seems like a set analysis problem. I want an expression to do COUNT(DISTINCT Userid) against a set where the set is defined as "Date <= the Date of the current row".

Can't find the correct expression. Any ideas?

Thanks,

Rob

1 Solution

Accepted Solutions
johnw
Champion III
Champion III

If I understand the problem, set analysis isn't the solution. A set is only analyzed once for the entire chart, not once for each row. So unless you always select only a single date at a time, you won't get the right results.

The most common solution might be the date island approach. Just use a different date field like OtherDate, use THAT as your dimension, and count(distinct if(Date<=OtherDate,Userid)). Performance will be bad on large data sets, of course.

You might be able to do the accumulation in the script. Something like rangesum(peek(MyCount),if(exists(Userid),0,1)) as MyCount, and then I guess create a separate table where you group by date and take the max(MyCount). But if you want the count to differ by selections, that's not going to do.

You could create a linkage table that links each date to every Userid encountered on or before that date. You'd probably actually work in the opposite direction, joining the first encounter of each user ID to all dates in the calendar on and after that date. That could be a lot of data, though, on top of being rather complicated to set up.

View solution in original post

6 Replies
johnw
Champion III
Champion III

If I understand the problem, set analysis isn't the solution. A set is only analyzed once for the entire chart, not once for each row. So unless you always select only a single date at a time, you won't get the right results.

The most common solution might be the date island approach. Just use a different date field like OtherDate, use THAT as your dimension, and count(distinct if(Date<=OtherDate,Userid)). Performance will be bad on large data sets, of course.

You might be able to do the accumulation in the script. Something like rangesum(peek(MyCount),if(exists(Userid),0,1)) as MyCount, and then I guess create a separate table where you group by date and take the max(MyCount). But if you want the count to differ by selections, that's not going to do.

You could create a linkage table that links each date to every Userid encountered on or before that date. You'd probably actually work in the opposite direction, joining the first encounter of each user ID to all dates in the calendar on and after that date. That could be a lot of data, though, on top of being rather complicated to set up.

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP
Author

Thanks for the response John. The date island approach works. I'll have to see if I can get acceptable performance when I scale up.

As you correctly guessed, I can't take the script approach because the count must consider selections.

Thanks,
-Rob

Not applicable

Hi Rob & John,

I'm pretty new to Qlik view, have just started and i'm trying to learn. I am also trying to make cumulative count to DAY(which is a dimension) of customers added in the current month.

I have:

DAY CustomerID

1 2364

1 6726

1 4627

2 5399

2 4627

3 5391

I wanted to have a cumulative count till each day on the chart for current month, but it doesn't work with the date island approach or may be i didn't understand it properly.

Please help.

Thanks

Sandeep

johnw
Champion III
Champion III

If I understand what you're after, and you have an add date for each customer, it's easy. Your dimension is AddDate. Your expression is count(Customer). Select "Full Accumulation".

Well, might be a little more to it since you want only the current month. Probably an expression like count({<AddMonth={'$(=date(monthstart(today()),'MMM YYYY'))'} Customer), though it'll depend on how you define your add month.

Or you could leave the expression alone, and use a calculated dimension, like if(AddMonth=monthstart(today()),AddDate), and then suppress null values in the dimension to remove everything but the current month.

See attached.

Not applicable

Thanks for the solution!

Anonymous
Not applicable

Hi John,

I tried island approach but it doesn't work. Have any idea why not working?

Here my work:

comparework.PNG

And my tables:

MATERIAL:

LOAD * INLINE [

    PERIOD, ID

    201501, 12

    201501, 13

    201501, 14

    201502, 12

    201502, 15

];

DATE:

LOAD DISTINCT PERIOD AS PERIOD2

RESIDENT MATERIAL;

Thanks a lot.