Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Set Analysis: Evaluation Rules and Documentation?

I'm facing the following problem with set analyis:

Having selected a number of months, I would like to determine the difference of clients between my first and my last months.

I'm trying this: =count({month=${=max(month)}} distinct client) - count({month=${=min(month)}} distinct client)

The whole expression evaluates in ways I don't understand - currently I'm struggling to simply understand the rules the evaluation follow (not to mention when to use ${=max(..)} and when to use ${max(..)} or all the other formats I tried...)

I'n a nutshell: can anybody point me to a good and comprehensive description of set analysis (which goes well beyong the handbook) and which also covers the use of formulas in set analysis expressions? Thanks.

17 Replies
Not applicable
Author

What is probably needed is a preprocessing of Set Analysis instructions that get rid of the overloaded cryptic $, <>, {}, -, * / stuff and let us say what we want in english (as e.g. lots of SQL statements expect from us)

johnw
Champion III
Champion III


sparur wrote:Oleg, you are right, sometimes we can to use a Flag for count Year To Date values (Previous year to Date), but it very limited cases. For example, If I want to calc YTD bases on my year and month selections (not on real today date). Or If I want to get sum of values for some interval of monthes (from StartDate to EndDate). How can I get it without set analysis? etc...<div></div>


For YTD based on year and month selections, you can use an "AsOf" table that connects an AsOfYear and AsOfMonth to all the real Years and Months.

AsOfYear, AsOfMonth, Year, Month
2010, Mar, 2010, Mar
2010, Mar, 2010, Feb
2010, Mar, 2010, Jan

A more complicated script can yield simpler charts in this way. Another advantage is that it will work for a chart with multiple months shown, where set analysis must have a specific month and year selected, since a set is only evaluated once per chart. Honestly, in practice, I use very little set analysis in my applications. It's a nice tool to have available when you need it, but there are often alternatives.

sparur
Specialist II
Specialist II

John, could you upload an example of your approach. I don't understand usability of your solution. Thank you.

Oleg_Troyansky
Partner Ambassador/MVP
Partner Ambassador/MVP

Anatoly,

Set Analysis was only introduced with version 8.5, a couple of years ago, while the problems that you are describing, had always existed. While Set Analysis is a very cool technique, it's not equally good for all problems. Developers should always keep in mind the limitations of Set Analysis and the fact that, even though Set Analysis works very fast, it's still a calculation that needs to happen in the "Run time", in front of the waiting User.

If there is an option to simplify the Run-time calculation and move some of the "heavy lifting" to the load time without sacrificing flexibility - those options should be always considered, and many times preferred over a complex Set Analysis condition. Hence - using Flags as end results of pre-calculated conditions coming from the script.

John mentioned the "as of" technique... As soon as I have a bit of spare time, I'll write a detailed blog about it. This issue seems to be very popular...

cheers,

johnw
Champion III
Champion III

I've updated one of my examples include several YTD approaches. The first (the fourth chart down) still uses set analysis, but only to pull the right type of date relationship from our generic AsOf table, sum({<YTDFlag={1}>} Revenue). The second (fifth chart) does away with set analysis completely by using a specific YTD As Of table. Mind you, for these charts, we could have done the YTD without any of this overhead by just setting it to full accumulation (last chart), but that's not always possible.

A search for AsOf or AsOfMonth will return a several other examples.

Edit: Wait, no, full accumulation isn't as good. It works fine when nothing is selected, but try selecting April through June, for instance. Your full accumulation now STARTS with April, even though we probably wanted it to start with January, since this is supposed to be a year to date number. The AsOf tables don't have this problem. They always include the full year to date, no matter which months you select from the chart.

Not applicable
Author

Wow! - thanks John, great stuff

I tried to follow your expressions and for the Island approach (first chart) I can't really see the reason for the outer if



if(sum(if(MonthYear<=ChartMonthYear and MonthYear>=addmonths(ChartMonthYear,-2),Revenue))
,sum(if(MonthYear<=ChartMonthYear and MonthYear>=addmonths(ChartMonthYear,-2),Revenue)))


what case is handled with this if? Tried without it and it showed same result.

Thanks

Juerg



johnw
Champion III
Champion III

Hmmm... I don't remember why I did that. The only thing it would seem to do is turn a 0 into a null, and thus exclude it from the average. So if one of the three months had no revenue, we would only average the other two. Oh! And that's exactly why it's there. For the first two months (January and February 2008), we will only have one month and two months of data. No, wait, it's even more critical than that. If we don't do it, the average is over EVERY month, not just the 1, 2 or 3 months. It looks almost the same because the scale drops by a factor of slightly over 10, so the chart looks exactly the same except the first two data points of the first year, and the Y axis labels drop a 0.

johnw
Champion III
Champion III

New version with synthetic key removed. It worked fine with the synthetic key, but there was absolutely no reason to have Month or Year on any of the AsOf tables. It was just wasting time and space and complicating things.

Edit: Looking it over, I'm strongly in favor of the generic date linkage table approach. Try this experiment - select a month in each of the tables in sequence and see what happens.

  • With the date island approach, nothing else is affected. That's fine. I think it's what we want. I have other reasons for not liking date islands.
  • With the generic date linkage table approach, the two tables using this approach are affected. To me, that makes sense - as of this date I selected, what is my three month rolling average and my year to date data? These could easily be combined into one table, for instance.
  • With the specific date linkage table approach, things get screwy. It changes every other table, and in ways we probably weren't intending.
  • With plain old accumulation, it's even worse. It affects every table, though I suppose you could argue that it should. But it also stops accumulating. If you select June, you don't see June YTD, you just see June. I think I already mentioned that earlier in the thread, but figured I should consolidate my comments a bit.