Solved: Dimension as value in a set analysis expression - Page 3 - Qlik Community

Report Inappropriate Content · ‎2011-04-19

Hello,

I've got the problem, that users would like to have values of dimensions in a set analysis expression, so that they can get a table like this:

Is there an opportunity to have an expression like this for the column "Sales NewCustomer":

Sum ({$<SalesDate={'>=NewCustomerFrom <=NewCustomerTill'} >} Sales)

Thanks!

Peggy

rbecher · ‎2011-04-19

I wonder why this is not working on set analysis...

- Ralf

Data & AI Engineer at Orionbelt.ai - a GenAI Semantic Layer Venture, Inventor of Astrato Engine

chris_johnson · ‎2011-04-19

I wish I knew, it'd help me a lot on another report I've done. Like Peggy mentioned, it seems to resolve the $(num(date)) to blank so it shows all of the entries (you can see this in the example).

I'm sure there must be a way to do it, I just can't figure it out!

Regards,

Chris

nathanfurby · ‎2011-04-19

Isn't this to do with the fact that within the dollar expansion it ignores dimensions etc.

Therefore would only ever work if you were using the Only() statement to return just one value for the field. As proof I tried the statement below and it works, although I imagine it will be returning the Max Dates from all dimensions. In this example the numbers just happen to be correct:

sum({<SalesDate={">=$(=Max(NewCustomerFrom)) <=$(=Min(NewCustomerTill))"}>} Sales)

rbecher · ‎2011-04-19

I think this is very strange, but useful!

- Ralf

Data & AI Engineer at Orionbelt.ai - a GenAI Semantic Layer Venture, Inventor of Astrato Engine

johnw · ‎2011-04-19


Ralf Becher wrote:I wonder why this is not working on set analysis...

I haven't carefully read the whole thread, but a set is only generated once for the entire chart, not once per row of the chart. Therefore, when you refer to a chart's dimensions in the set, if there is more than one value possible for that dimension, the dimension reference will return null. I assume that's the problem here.

And yes, any use of $() will ALSO only be evaluated once, and before the rest of the set analysis. But again, the set analysis will only be evaluated once, with or without $().

There IS a way around this, but it's rather complicated and inflexible so I don't recommend it unless you've exhausted just about every other option. See this wiki entry:

http://community.qlik.com/wikis/qlikview-wiki/evaluating-sets-in-the-context-of-a-dimension.aspx

Attached is an example using that approach for a two-dimensional chart where one of the dimensions is cyclic. Hopefully a glance through the relevant section of the script will scare everyone away from using this approach.

nathanfurby · ‎2011-04-20

OK - so I've never had cause to consider this before and this confusing/frustrating topic is becoming a little clearer. What you are saying is that you can never use a dimension value in Set Analysis. Unless of course there is only one value across the entire dataset.

It explains why this statement does not work, even when just one customer has been selected:

sum({<Customer={"$(Customer)"}>}Sales)

But this one does:

sum({<Customer={"$(=Only(Customer))"}>}Sales)

In my mind I was seeing the Set Analysis and Dollar expansion as one in the same. Is there another way to do a comparison in Set on a field or variable without Dollar Expansion?

I guess the solution for this situation would be to create a field on load to identify if the sale occured within the 8 month 'New Customer' period?

johnw · ‎2011-04-20


NathanFurby wrote:What you are saying is that you can never use a dimension value in Set Analysis. Unless of course there is only one value across the entire dataset.

That is my understanding.


NathanFurby wrote:Is there another way to do a comparison in Set on a field or variable without Dollar Expansion?

Yes. You can do this for example:

sum({<SalesDate=NewCustomerFrom>} Sales)

But what you'll find is that again, the set is only evaluated once for the entire chart. Add Customer as a dimension, and for each customer, it will sum sales where the sales date is ANY of the NewCustomerFrom dates. It doesn't limit itself to that customer's date. Now, maybe I'm misunderstanding what I'm seeing, but that's my understanding of the situation, and the attached example does include on customer C sales from customer B's NewCustomerFrom date.


NathanFurby wrote:I guess the solution for this situation would be to create a field on load to identify if the sale occured within the 8 month 'New Customer' period?

Yes, that's probably what I'd do IF (and only if) performance of the simple sum(if()) was insufficient. I wouldn't assume an 8 month range since we have both a from and till date, so I'd check the date range explicitly. My first thought was an intervalmatch, which would work, but there's no reason for that since there's only a single date range per customer to look at. And instead of a flag, I'd load a SalesNewCustomer field onto the Table_Sales. Then you can just sum(SalesNewCustomer) with no condition or set analysis at all. That should give us maximum chart performance at the cost of some script performance, which is usually a good trade off to make. See attached. Script below.

LEFT JOIN (Table_Sales)
LOAD
Customer
,NewCustomerFrom
,NewCustomerTill
RESIDENT Table_Customer
;
LEFT JOIN (Table_Sales)
LOAD *
,if(SalesDate>=NewCustomerFrom and SalesDate<=NewCustomerTill,Sales) as SalesNewCustomer
RESIDENT Table_Sales
;
DROP FIELDS
NewCustomerFrom
,NewCustomerTill
FROM Table_Sales
;

Edit: Thinking further, I'd probably just leave the NewCustomerFrom and NewCustomerTill fields on the Table_Sales table, and drop the Table_Customer table completely. Nothing wrong with a little denomalization, and it may improve performance here and there.

nathanfurby · ‎2011-04-21

Thanks John. Very useful as always.


John Witherspoon wrote:
Yes. You can do this for example:
sum({<SalesDate=NewCustomerFrom>} Sales)

Just gave this a go and interestingly this statement does work, and on a dimension basis - but only if there has been a selection in the NewCustomerFrom field. When no selection is made then 0 is returned.

johnw · ‎2011-04-26


NathanFurby wrote:Just gave this a go and interestingly this statement does work, and on a dimension basis - but only if there has been a selection in the NewCustomerFrom field. When no selection is made then 0 is returned.

Yeah, I noticed that too. Seems wrong to me, but since the help file fails to describe the behavior, I figure it's working as designed, even if I don't like the design.

From the help file, this is the syntax that allows for this option:

element_set ::= [ field_name ] | { element_list } | element_function

And these look to be the only examples, all of which have a field using its own selections and not values from some other field:

sum( {$<Product = Product + {OurProduct1} - {OurProduct2} >} Sales )
returns the sales for the current selection, but with the product "OurProduct1" added to the list of selected products and "OurProduct2" removed from the list of selected products.

sum( {$<Year = Year + ({"20*",1997} - {2000}) >} Sales )
returns the sales for the current selection but with additional selections in the field "Year": 1997 and all that begin with "20" - however, not 2000. Note that if 2000 is included in the current selection, it will still be included after the modification.

sum( {$<Year = (Year + {"20*",1997}) - {2000} >} Sales )
returns almost the same as above, but here 2000 will be excluded, also if it initially is included in the current selection. The example shows the importance of sometimes using brackets to define an order of precedence.