Set analysis, in stead of If statement - Qlik Community

Report Inappropriate Content · ‎2010-02-08

Hi,

Please look at attached application. There's two tables loaded with no common field name, but I want the fields "CustomerId" and "cid" to be associated by a set analysis statement.

In the attached application, I have made a table chart with an expression with an If statement, and I want to make an expression with a set statement that calculates the same numbers in that table chart.

How do I write that set statement? Is it possible?

Thanks in advance!

Björn

Report Inappropriate Content · ‎2010-02-08

Hi,

I guess that if you use customer in set statement than you can't use customer in dimension. Because when you use set analysis the same expression is applied to all rows.

The second table is made with set statement. So you see Eddie Van Halen has the sum of all four customers, Robert Palmer also has the same sum and so on.

Milda

Anonymous · ‎2010-02-08

You can rename cid as CustomerId. Then the tables will link. Then you only need to sum(amount) in your table.

Report Inappropriate Content · ‎2010-02-08

Thank you jsomsen, but I don't want to do that. I want to make a set statement that calculates the if statement in my example.

Report Inappropriate Content · ‎2010-02-08

Thank you Milda, but that doesn't help me. I need an expression with a set statement that does "the same" as the If-statement-expression in my example. If it's possible...
Björn

Miguel_Angel_Baeyens · ‎2010-02-08

Hello Björn,

Set Analysis allows you to indicate one field with values of other field. This is very slow though and performance is severely affected on more than certain number of records. Anyway, it looks like

Sum({< Field1 = Field2 >} Value)

. Depending on the amount of records involved, I'd keep using IF. Hope this helps.

rwunderlich · ‎2010-02-08

Bjorn,

I don't think it's possible using Set syntax. As John reminded me the other day,
http://community.qlik.com/forums/t/25409.aspx
a set is analyzed once for the entire chart, not for each row.

You can simplfy the island approach by moving the IF into a hidden dimension like:
=if(CustomerId = cid, cid)

Then your expression columns do not not require the IF and may be written simply like:
=sum(amount)

See attached example.

-Rob

johnw · ‎2010-02-08


Rob Wunderlich wrote: You can simplfy the island approach by moving the IF into a hidden dimension like:
=if(CustomerId = cid, cid)
Then your expression columns do not not require the IF and may be written simply like:
=sum(amount)

I hadn't even thought to simplify like that. Good idea.

But it made me curious - what is the relative performance of doing the IF in the dimension vs. doing it in the expression? I honestly didn't expect a difference of significance with one dimension and one expression, figuring that QlikView would have to do the same work regardless. So these results (50000 rows, 1000 distinct values on the matching columns) surprised me:

v8.5 v9.0 SR2
if() in dimension 30547 ms 27719 ms
if() in expression 3407 ms 3375 ms

Using an if() in the expression seems SIGNIFICANTLY faster than doing the exact same thing in the dimension. I'm not sure why that would be. I would have THOUGHT that with one dimension and one column, the performance would be the same. Then I would expect that each additional expression would be "free" with the dimension approach, but not with the expression approach. Testing that part...

v8.5 v9.0 SR2 3 expressions v8.5
if() in dimension 30547 ms 27719 ms 30641 ms
if() in expression 3407 ms 3375 ms 3406 ms

... it turns out that I'm completely wrong again. Apparently, either way, almost all of the time is wrapped up in matching up my fields, and almost no time is actually spent doing the calculations. That makes me think that QlikView is clever enough to realize that the conditions in my expressions are the same, and only do the looping match one time instead of three times. I could be off base, though.

In any case, while I might start with the dimension approach, I'd suggest being willing to switch to the more complicated expression approach if performance seems to be a problem. And even with the expression approach, doing an if() like this is simply going to be slow, even on some fairly small data sets. It makes sense to me here, though. In a sense, this was processing 50000 rows * 1000 distinct values for 50,000,000 combinations to plow through.

Anonymous · ‎2010-02-09

Hello all. I am not sure why this has been bothering me, but I am trying to see if I am missing something. Why would you not want to associate CustomerId and Cid in your script so that these tables were then linked? Then all you would need is a simple sum() in the chart. It seems that placing that If ( ) statement would just add unneeded overhead to the analytic.

This curious mind thanks you in advance,

JS

Report Inappropriate Content · ‎2010-02-10

Hi, thanks for your answer. It's due to work with an app with several tables with huge amounts of records. I'm trying out how to relate the tables in the QV layout without actually relating them in the load script. Just to see if it's possible. Else I have to come up with a DW solution before I load anything into QV.
The performance of the QV app is bad when having these huge tables related, and the load gets really heavy when I use mappings and joins to reduce the amount of tables, even when I do the most of the data preparement in SQL script.

Regards,

Björn