Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Set analysis to show Sales between a Starting and a Ending date

I want to show Sales from a specific period (user can select two variables: start date and end date) using Set analysis. I have managed to do this for one day:

sum( {$< SalesDate = {'$(StartDay)'} >} Sales)

How to do for a period?

sum( {$< SalesDate >= {'$(StartDay)' and SalesDate <= {'$(EndDay)'} >} Sales). This is just an idea of what I need, of course this expression that I wrote doesn't work (and I can't make it work).

Thank you.

1 Solution

Accepted Solutions
mike_garcia
Luminary Alumni
Luminary Alumni

Hi, Try

sum({$<SalesDate = {">= $(StartDay) <= $(EndDay)"}>} Sales)


(Notice the double quotes to indicate a search string)

Hope this helps.

Mike.

Miguel García
Qlik Expert, Author and Trainer

View solution in original post

24 Replies
mike_garcia
Luminary Alumni
Luminary Alumni

Hi, Try

sum({$<SalesDate = {">= $(StartDay) <= $(EndDay)"}>} Sales)


(Notice the double quotes to indicate a search string)

Hope this helps.

Mike.

Miguel García
Qlik Expert, Author and Trainer
Not applicable
Author

Thanks for the answer, Mike.

From what I have read in the reference manual, your formula looks pretty good at first glance.

The problem is that it is not working for me. Probably this is happening because the logical operators (greater than, less than, etc) are acting differentely on set analysis when dealing with dates and variables (my variables are defined as a predefined number series in input box, but shown as date - in Numbers tab).

For one date, I am using this fomula:

sum({$<SalesDate = {'$(StartDay)'}>} Sales)

and it is working.

The problem appears when I want to do the same thing, but for an interval.
johnw
Champion III
Champion III

I don't know if this is the problem or not, but all your dates need to be in the same format in the set analysis expression. So if, for instance, SalesDate is in format 'YY/MM/DD', but StartDay is in format 'DD/MM/YY', it wouldn't work as is. You would instead need to do something like this:

sum({$<SalesDate = {">=$(=date(StartDay,'YY/MM/DD')) <=$(=date(EndDay,'YY/MM/DD'))"}>} Sales)

Not applicable
Author

Thanks John. I need to pay more attention to have the same type of formatting. The expression given by Mike Garcia works very well.

gandalfgray
Specialist II
Specialist II

[quote user="Mike García"]

Hi, Try

sum({<SalesDate = {">= $(StartDay) <= $(EndDay)"}>} Sales)

(Notice the double quotes to indicate a search string)

Hope this helps.

Mike.

The above syntax will not give what you would expect if StartDay is after EndDay. (Try it and you will see what I mean Wink)

The correct way to write this is:

sum({$<SalesDate = {">= $(StartDay)"} * {"<= $(EndDay)"}>} Sales)
This will guarantee that you get 0 for sum(Sales) if StartDay is after EndDay

Not applicable
Author

Thanks Goran. Your expression is remarkable. I always assumed that the users will know to choose an EndDate which is greater than the StartDate. I don't have to assume this anymore.

gandalfgray
Specialist II
Specialist II

Well, of course you may be able to prohibit the users to select an EndDate smaller than the StartDate,

but the syntax I suggested works also for all kind of other sets. You will get the intersection of the two sets.

I am not sure really how the syntax suggested by Mike should be interpreted

Edit:

I experimented a bit and it seems that Mike's syntax works like INTERSECT ( *-operator ) as long as StartDate <= EndDate,

but when StartDate is larger than EndDate Mike's syntax works like UNION ( +-operator ) or like XOR ( /-operator ) (I can't decide which because they are the same for two non-overlapping sets)

mike_garcia
Luminary Alumni
Luminary Alumni

[quote user="Göran K"]I am not sure really how the syntax suggested by Mike should be interpreted

Let me try and explain the syntax, so that it can be properly understood and every developer can evaluate both approaches and decide which one to take.

The part

">= $(StartDay) <= $(EndDay)"
indicates a search string. Just as you would search on a list box containing numbers, for example. The string ">5<10" will give you numbers (if available) ranging from 6 to 9. That's just how it works and it is the same for dates.

If your search is something like ">10<5" (Start bgreater than End) it will return NO values, since there is no number that meets both criteria. So, that covers the point of "What if a user selects a StartDate greater than End Date?". It will just return zero, as the other approoach does.

And no, it does not work like Union Operator, nor does it work like XOR.

I did not take the time to do some testing but I believe that specifying ONE set (with one search string) is more efficient that specifying TWO sets and then use an INTERSECT operator.

Anyway, I've always liked the idea of having different solutions for the same problem and welcome new ideas.

Mike.

Miguel García
Qlik Expert, Author and Trainer
johnw
Champion III
Champion III

[quote user="Mike García"]I did not take the time to do some testing but I believe that specifying ONE set (with one search string) is more efficient that specifying TWO sets and then use an INTERSECT operator.

I haven't tested it either, but I'm pretty confident that you are correct based on my hopefully correct understanding of what's going on here. However, I'd also bet that Göran's expression WON'T be slower (or at least not noticeably), because QlikView should be smart enough to simply intersect the two search results for a single field, rather than intersecting two complete sets of data. Basically, I'm looking at three ways of doing this. Yours and Göran's should be fast, and the one I will introduce should be much slower:

sum({<SalesDate = {">= $(StartDay) <= $(EndDay)"}>} Sales)

This uses a single search expression that tells QlikView to "select" dates between the start and end date. So let's say you "select" three months out of 30 months. QlikView's normal associative logic is borrowed, and you almost immediately have the data you requested, about 10% of what is available.

sum({<SalesDate = {">= $(StartDay)"} * {"<= $(EndDay)"}>} Sales)

With this expression, QlikView will probably first use the search expression ">=$(StartDay)" to get a list of all SalesDates greater than or equal to the start day. That's a short list, so this is extremely fast. Then it will do the same for EndDay, which should also be extremely fast. Then it should intersect these two short lists. That should be extremely fast. Then it should proceed like it did in the previous expression, "selecting" those fields, borrowing QlikView's normal associative logic, and almost immediately giving you the 10% of data you wanted. It should be only a tiny bit slower if everything works like I suspect.

sum({<SalesDate = {">= $(StartDay)"}> * <SalesDate = {"<= $(EndDay)"}>} Sales)

THIS is where I suspect you could get into performance trouble, because I think QlikView will intersect two sets of data intead of just two lists of field values. So I suspect this will take over twice as long as the previous two. Let's say you selected the middle three months or so. QlikView "selects" all dates less than the End, or about 60% of your data, and that's data set one. Then QlikView "selects" all dates greater than the Start, or about 60% of your data, and that's data set two. Then QlikView does an intersection operation on the two sets, keeping the 10% of rows that are in both sets. Doing all of that is expensive, at least done "naively" like this, which is what I suspect it does.

Now, I could be wrong. For instance, QlikTech could have coded some pre-optimization that recognizes that this is an intersection of two sets for the same Field, and therefore intersect the values for the field FIRST, which should be much faster. And it might be fun to test and see. I just haven't.