IF vs Pick Match

adamdavi3s · ‎2016-11-22

This is one for stalwar1‌ to pitch in on as we have debated this a couple of times!

I've set up a sample which shows a pseudo-example of the situation we were in when advised by Qlik to use the pick match.

Note, there was a logical reason for the fact and dimensions being set out this way, again on the advice of Qlik.

In this basic example, not much happens until you autogen around 10m rows, even then the objects calculate only 50ms apart, pick(match( being marginally faster.

However crank it up to 100m and the results are looking better for pick match which is roughly 20% faster at 22 seconds vs 27 seconds which is a real visible difference in the front end.

Obviously whilst there is some complexity in this dash, it isn't anywhere near a real world one which is probably why we saw the IF fail but the PICK(MATCH( work in our particular example

My gut says that using either will work just fine, until you start to get into high volume / high complexity / loads of nested IFs

Discuss

m_woolf · ‎2016-11-22

When I code the IF chart like this:

sum(IF(ID='ID1',ID1,

IF(ID='ID2',ID2,

IF(ID='ID3',ID3,

IF(ID='ID4',ID4,

IF(ID='ID5',ID5,0))))))

It seems to recalc faster.

When I code the PICK chart like this:

sum(PICK(MATCH(ID,'ID1','ID2','ID3','ID4','ID5'),ID1,ID2,ID3,ID4,ID5))

It seems to recalc slower.

Go figure.

m_woolf · ‎2016-11-22

Just ran again with a big change for the sum(PICK coding:

sunny_talwar · ‎2016-11-23

adamdavi3s‌‌ -

I have not looked at this as of now, but I def. plan to do some research on this early next week. In the mean time, I would be also be excited to know what johnw‌ thinks about this.

Best,

Sunny

adamdavi3s · ‎2016-11-23

Yep I am going to do some more testing as well, especially having seen the results above

johnw · ‎2016-11-28

I've been off for a week so I'll be playing catch up on the forum and on my job.

The numbers being reported in this thread sounded very strange to me, so I did my own testing. To determine the calc time, I put each chart on a separate tab, and I removed the total from each chart, which only confuses the picture. I saved it on a blank main tab. I closed QlikView completely to clear the buffers, then opened the application, and switched to one of the tabs. I noted the calc time. Then I closed QlikView again, and repeated for the next tab. Here are the results I got with 10,000,000 rows, two runs for each table to try to get the vaguest idea of statistical distribution:

578 563 if(sum())
578 579 pick(match(),sum())
2875 2859 sum(if())
1453 1578 sum(pick(match()))

And that's roughly what I guessed I'd see. I figured two main things would drive performance differences:

the speed of executing a single nested if vs. pick
the number of times they are executed

For the first item, let's take as an example that we're on the row for ID3. For a nested if(), I expect QlikView to work something like this:

Put ID into address 1
Put 'ID1' into address 2
Compare address 1 to address 2
Put ID into address 1
Put 'ID2' into address 2
Compare address 1 to address 2
Put ID into address 1
Put 'ID3' into address 2
Compare address 1 to address 2

For the pick(match()), I expect QlikView to work something like this:

Put ID into address 1
Put 'ID1' into address 2
Compare address 1 to address 2
Put 'ID2' into address 2
Compare address 1 to address 2
Put 'ID3' into address 2
Compare address 1 to address 2

In other words, I expect the pick(match()) to remove the redundant steps of putting the ID into address 1 when it's already there. With a nested if(), I expect QlikView to forget what's in address 1, and so keep reloading it. With the pick(match()), I expect QlikView to remember. So if our "put" and "compare" operations take similar time, I'd expect it to take roughly 2/3 the time of the nested if(), though the overall time wouldn't be quite that good because I'm ignoring the sum().

Now, how often does it execute? That I think depends on where we put the sum(). If we put the sum() on the inside, then I expect us to do the if() or pick(match()) only five times, once for each value of ID. And so while not strictly identical, I wouldn't expect to be able to detect any difference in speed between if(sum()) and pick(match(),sum()).

But if we put the sum() on the outside, then I believe that what's on the inside has to execute independently for every row. So instead of 5 times, we're doing it 10,000,000 times. This dramatically slows down both approaches, and it also highlights the significant difference between nested if() and pick(match()). The only thing that I found surprising was that pick(match()) was even faster in comparison than I would have thought, taking only 1/2 the time instead of over 2/3 the time. But I'd have also been surprised if I'd guessed 100% correctly, as I really don't know the internals of QlikView and how it's actually processing, relative speeds of "put" vs. "compare" and so on. Maybe it handles literals in a way that doesn't require a "put" to do a "compare", say. But even then, it seems like there's something more that I'm missing that makes it even faster than that.

In case it's relevant, I'm still on version 11.

Lech_Miszkiewicz · ‎2019-09-03

It has been 3 ears since you gentlemen have done your tests. I only presume this will behave similar in Qlik Sense now. Given that QlikView runs on the same QIX engine we may as well do the same exact test on latest version of QlikView and should be able to come to conclusion.

Recently I had a vocal argument with one of my colleagues re Pick vs IF and my understanding was similar to what @johnw explained in this post. I am keen on doing little more digging around this just to learn if anything has changed in how Qlik handles those calculations

regards

Lech

cheers Lech, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful to the problem.

Related Topics