Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I'm facing the following problem with set analyis:
Having selected a number of months, I would like to determine the difference of clients between my first and my last months.
I'm trying this: =count({month=${=max(month)}} distinct client) - count({month=${=min(month)}} distinct client)
The whole expression evaluates in ways I don't understand - currently I'm struggling to simply understand the rules the evaluation follow (not to mention when to use ${=max(..)} and when to use ${max(..)} or all the other formats I tried...)
I'n a nutshell: can anybody point me to a good and comprehensive description of set analysis (which goes well beyong the handbook) and which also covers the use of formulas in set analysis expressions? Thanks.
Hi,
Have a look at the attached thread.
Hope this will help you.
http://community.qlik.com/forums/p/28667/110280.aspx#110280
- Peterson
I don't have anything better to point you at than the help text, but I can try to give some basic information that is relevant to your example.
First, you'll want to understand HOW set analysis does the comparison. In this case, you're trying to "select" a specific month. You do this by telling it which month to select. But it isn't like a normal =, a normal date comparison. Instead, set analysis behaves as if what you were typing was going in a search expression in a list box. One thing to take from that is that it will always therefore be a string sort of comparison, not a numeric or date sort of comparison.
In other words, if "month" is defined as the first day of the month, and formatted as 'MMM YYYY', so it shows up in your list box like 'Jun 2010', you somehow have to end up with that TEXT inside of the set analysis expression to match it.
Something else you'll want to know that will put you ahead of a lot of people is the difference between using single quotes and double quotes.
month={Jun 2010} <-- this won't work
month={'Jun 2010'} <-- this is the LITERAL 'Jun 2010'
month={"Jun 2010"} <-- this is the SEARCH EXPRESSION "Jun 2010"
What's the difference? Well, not much when you're just specifying a single value like this. Searching for Jun 2010 will give the same result as just selecting the literal Jun 2010. But I believe you MUST use literals if you're specifying multiple values:
month={'Jun 2010','Jul 2010'}
And you MUST use a search expression when you're searching for a range:
month={'<=Jun 2010'} <-- this won't work; it's looking for '<= Jun 2010' as a literal value for month
month={"<=Jun 2010"} <-- yes, it's strange, but this works even with the space because it's a search expression
So best to just get in the habit of using the right kind of quotes for the right situation.
Now let's apply what we've learned. We want to use max(month) somehow. But we're going to need to do that as our very first step, before we start messing around with restricting our month value. We also can see that we need to literally insert our month into the expression, because "max(month)" isn't a valid search expression, and 'max(month)' certainly isn't the literal we're looking for. Fortunately, a dollar sign expansion covers all those bases. So as a first pass, we're looking for something like this:
$(=max(month))
Now, what happens if you use that expression? Well, max(month) returns a date field, but fortunately, it's a formatted Date field, and as best I can tell, it preserves the format of the "month" field. If it didn't, and not all expressions work so nicely, we'd need to do this to format it manually, so that's often a good first thing to try if you're having trouble:
$(=date(max(month),'MMM YYYY')))
The dollar sign expansion actually literally returns your maximum month, Jun 2010 for the sake of argument. NOT a date or a month. Fortunately, that's just what we need since set analysis can't do a date or numeric comparison.
However, it has a space in it. That won't work, because QlikView won't realize it's a literal. We have to enclose it in single quotes to tell QlikView that it is a literal:
'$(=max(month))'
That should LITERALLY return this when executed:
'Jun 2010'
Which we've already seen we can use in set analysis. So now we just plug our expression so far where we had 'Jun 2010' in the example much further up:
month={'$(=max(month))'}
I would presume you want that to be the maximum of the months SELECTED. If you want the maximum of ALL months, you'll need to use set analysis INSIDE of the max(month) expression:
month={'$(=max({1}month))'}
I'm also assuming that the ONLY thing we're trying to override is the month. If you've selected five clients, you only want to count from THOSE clients, not from all clients. The set expression to do that is this:
{$<month={'$(=max(month))'}>}
The dollar sign here is NOT dollar sign expansion, but a way of telling set analysis that you're modifying the currently-selected set of data, not starting over. However, that's also the default, so you can leave the $ off. I always leave it off, in the hopes of avoiding confusion with dollar sign expansion.
{<month={'$(=max(month))'}>}
So we're almost there. Now you just have to add the count(distinct client):
count({<month={'$(=max(month))'}>} distinct client)
And of course do your desired subtraction:
count({<month={'$(=max(month))'}>} distinct client)
- count({<month={'$(=min(month))'}>} distinct client)
With any luck, that should do it.
I've have learnt a lot from this.
Thanks for sharing John.
Hi,
John's answer is a great help. There is an additional step that can help make your set analysis easier to scan, and gives a little more flexibility if you also wish to exclude any date selection you may have made, or make changes to several similar set analysis formulae without having to traul through all your charts when you need to make a change:
Create an input box with 2 new variables (e.g. 'MinDate' and 'MaxDate')
For MinDate:
In the 'Constraints' tab change it to read-only, and set the Value to:
=TimpStamp(min(Month),'MMM YYYY')
or:
=TimeStamp(min({1} Month),'MMM YYYY')
depending on whether you wish to include or exclude your date selection.
Then set MaxDate in a similar fashion using max instead of min.
You can then hide this box in the 'Layout' tab with a conditional show of 1=2.
Now go back to your set analysis as John has described above using '$(MinDate)' instead of '($(=min(month))'. Again do the same for MaxDate.
This may seem like a long way round, but the principle can help you make lots of changes at the same time if charts are replicated, and code will be easier to scan when things start to get more complicated (as they usually will!).
It will also remove the problem you get where analysis is performed on each individual bar on a chart, and you are trying to use a linked variable that lies outside of the range of the bar in the chart (this can help in your formulae outside of the set analysis too).
Hope this helps,
Craig
Hello, Jonh (and others QV guru's).
I created a small example for checking your statement:
Something else you'll want to know that will put you ahead of a lot of people is the difference between using single quotes and double quotes.
month={Jun 2010} <-- this won't work
month={'Jun 2010'} <-- this is the LITERAL 'Jun 2010'
month={"Jun 2010"} <-- this is the SEARCH EXPRESSION "Jun 2010"
What's the difference? Well, not much when you're just specifying a single value like this. Searching for Jun 2010 will give the same result as just selecting the literal Jun 2010. But I believe you MUST use literals if you're specifying multiple values:
month={'Jun 2010','Jul 2010'}
And you MUST use a search expression when you're searching for a range:
month={'<=Jun 2010'} <-- this won't work; it's looking for '<= Jun 2010' as a literal value for month
month={"<=Jun 2010"} <-- yes, it's strange, but this works even with the space because it's a search expression
So best to just get in the habit of using the right kind of quotes for the right situation.
and I see a slightly different behavior of the system... I see that expression with single quotes is work also.
Can you look at my example and explain me
What have I done wrong? or is it your explanation is not entirely correct?
You're right.
Set analysis has been surprising me today. I was wrong in another thread when I suggested Month={"<=$(=Month)"}. It works for my normal definition of month, date(monthstart(Date),'MMM YYYY'), but not for a some other definitions of month, like date(monthstart(Date),'MMMM') and month(Date). I honestly don't know why those would confuse QlikView when the definition I use for month does not. But they do.
And it looks like we have a similar situation here. I extrapolated a bit from what I've seen and from what I've read. I should instead have verified with a sample application that everything behaved like I was saying, or I should have shut up, because suddenly I'm getting a lot of things wrong.
Here's what the help text says:
element ::= fieldvalue | " search_mask "
'<=Jun 2010' is NOT a field value, and it is not a search mask in double quotes. It is a search mask in single quotes, which is not an allowed syntax in the help text. So it should not work. But obviously it does. So I would say there's an error in the help text in this case, or I'm even more wrong, now about what a "field value" is or what a "search mask" is. Actually, now I'm kind of frightened. What if my field LITERALLY has a value '<=Jun 2010'. What then?
Turns out then I'm in trouble, and QlikView can't distinguish between looking for a literal and specifying a search expression since it's ignoring its own specified syntax. I'd call that a bug. I did this:
LOAD
dual(Text,Number) as Dual
,Number
INLINE [
Text,Number
>=1,1
>=5,2
>=10,10
];
And I get these results:
sum({<Dual={'>=5'}>} Number) = 10 <-- should be 2?
sum({<Dual={">=5"}>} Number) = 10
I think what may have happened is that QlikTech tried to get TOO clever. They tried to make it so you didn't have to worry about what kind of quotes you put something in. But in the process of doing so, they BROKE the ability to distinguish between literals and search expressions when it really matters. In the first expression above, I am intending to find the LITERAL '>=5', which has a numeric value of 2. Instead, it decided that even though I put it in single quotes, I was writing a search expression, so it searched for >=5 instead. That does not seem like the right behavior to me, and does not seem to match the help text. Not sure I'll bother reporting it, though, because I'm almost guaranteed to get "working as designed", which drives me even more insane.
So at this point, I'm just feeling frustrated and annoyed with the set analysis implementation. I mean, I'm NORMALLY annoyed by it, in that I would like it to be able to do simple comparison, just like you can do with an if(). I can work around it, and I'm GLAD that it supports search expressions, but I'm annoyed that it can't support just basic comparisons, I'm annoyed at how clunky the syntax is, and I'm annoyed that the behavior does not obey the specified syntax.
Sorry. Cranky today. Thanks for double checking what I said, Anatoly. I HATE giving out misinformation. Apologies to everyone.
Agree with both John and Anatoly - Set Analysis can be quite unpredictable when it comes to evaluating Dates, Months and such.
Knowing that it does, we can always find our way around it, using simpler numeric conditions. You probably noticed by now, my favorite solution for Dates is using Flags, but there are many other ways...
Back to the original question - the most comprehensive documentation of Set Analysis is, unfortunately, the Help Section article. It's 5 pages of definitioned and examples, and I used to print it out every time I needed to work with Set Analysis, until the terms actually settled in my head. If you take the time to read the definitions and follow the examples, it will actually make sense, and you will see the difference between single quotes and double quotes, and when to use $(...) vs. $(=...). It's all there, just not in a lot of detail, so you actually need to pay attention to the examples...
Set Analysis deserves to be described in a separate book. Maybe one day someone will find the time to write one...
John, I like you and Oleg very disappointed at the behavior of the system. Should we pay attention to the development team from QlikTech to this? I think that such information is verified by such respected people on the community forum as you and Oleg must be heard and maybe we will get a more detailed description, or may fix this in the system.
Colleagues, what do you think about it?
Maybe Jason Long will help us bring this issue to the development team? Jason Can you hear us? 🙂
Oleg, you are right, sometimes we can to use a Flag for count Year To Date values (Previous year to Date), but it very limited cases. For example, If I want to calc YTD bases on my year and month selections (not on real today date). Or If I want to get sum of values for some interval of monthes (from StartDate to EndDate). How can I get it without set analysis? etc...