vgutkovsky
Master II

performance with valuelist

I've run into a tricky problem, and hopefully someone here has seen it and can save me a huge headache. I have a pivot table with 2 dimensions and 5 expressions. I added a 3rd valuelist dimension (3 members) to break my expressions up into subcolumns. Everything calculates correctly, but performance took a huge hit: pre-valuelist the pivot table calculated in about 20 seconds; post-valuelist it takes close to 15 minutes! I tried replacing the valuelist with an expression island, but performance remained just as bad. My guess is that the expressions are trying to calculate over the valuelist (even though all I'm using it for is to check which column we're in). The general syntax for my expressions is:

if(ValueList('A','B','C')='A',
    expression X,
    if(ValueList('A','B','C')='B',
        expression Y,
        expression Z
    )
)

I didn't try using aggr() to get out of the valuelist, because that would cause other problems (it's a little complicated to explain). Thoughts?

Thanks,

6 Replies
johnw
Champion III

I don't know why there would be a performance problem. I would probably go with the island table to avoid any calculated dimension weirdness, and do a pick(match()) so that we only need to check the value once instead of twice, but I don't see any reason that either of those changes would improve performance in any significant way.

pick(match(IslandValue,'A','B','C')
,expression X
,expression Y
,expression Z)

vgutkovsky
Master II
Author

Thanks, John. I switched it to the pick(match()) but, as you predicted, it didn't make much of a difference. Like I mentioned, I think the expressions are trying to calculate over an island dimension; this might be causing a Cartesian join situation, and we're dealing with quite a bit of data here. Do you know of any way (other than aggr) to tell an expression to ignore a field in its calculations? Maybe something with set analysis, but I've never seen this done before. Not just ignore user selections, but ignore the field itself...
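The closest I've seen plain set analysis get is ignoring selections on the field, something like this (field name assumed):

Sum({<IslandValue=>} Sales)

But that only clears any selection on IslandValue for the aggregation; the island dimension still participates in building the chart's virtual table, so it doesn't remove the field itself from the calculation.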

Thanks,

johnw
Champion III

How many values do you have in the actual application? It seems like if you have N values, even if it's doing a cartesian join, it should only take N times as long to run. 20 seconds to 15 minutes would imply 45 different values, which seems unlikely?

A cartesian join seems unlikely anyway. The expression seems clear.

However... I have seen something like this before, now that I think more about it. When Gordon Savage worked out how to do set analysis that was sensitive to the dimension values of a chart, at first I had trouble replicating the high performance he observed. My problem turned out to be that I'd changed a feature of his solution, making it look much more like the pick(match()) structure I just posted.

That broke the performance, perhaps just like your performance is broken. The solution was to add back in another feature of his solution - variables that held the expressions. That done, the performance problems went away.

I can't explain WHY it had performance problems in the first place, or WHY using variables made them go away, though it made a vague kind of sense in my specific example. Regardless, perhaps it gives us a solution to your problem: try replacing all of the subexpressions with variables that store the subexpressions.

pick(match(IslandValue,'A','B','C')
,vExpressionX
,vExpressionY
,vExpressionZ)
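For this to work, the variables would hold the subexpression text. A hypothetical sketch of how they might be defined in the load script (variable names and expressions are assumptions, not from the original posts):

SET vExpressionX = Sum(Sales);
SET vExpressionY = Sum(Cost);
SET vExpressionZ = Sum(Sales) - Sum(Cost);

In the chart, each would then typically be referenced with dollar-sign expansion, e.g. $(vExpressionX), so that the stored text is expanded into the expression before evaluation.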

For that matter, in my example, even the full expression is yet another variable. I don't think that's necessary, but I'm not certain.

Anyway. Might work. Might not. Worth a try.

vgutkovsky
Master II
Author

Thanks for the suggestion, John. QV will never cease to surprise me 🙂 I just implemented this, and while it helped performance a tiny bit, it didn't improve things significantly. I'm totally stumped by this one; I'll probably be scratching my head over it for a few days. I really appreciate your input!

Kartik_semwal_99
Partner - Contributor

Hi vgutkovsky, I'm stuck on the same issue. Have you found a solution for this?

marcus_sommer

Don't use such approaches; instead, transfer the essential logic into the data model.

Creating the virtual table behind the object, with the needed dimensionality, is the heaviest part of each calculation and, AFAIK, is to this day single-threaded. The multi-threaded sum() and count() aggregations are nearly negligible in regard to the overall run-time. Each kind of extra dimensionality adds a roughly exponential performance impact, especially synthetic dimensions, which cause cartesian effects.

If the response time is already rather slow at about 20 seconds, as mentioned in the original post, any kind of synthetic dimension will slow performance down further, and significantly.
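To make "transfer the logic into the data model" concrete, here is one hedged sketch (all table, field, and measure names are assumptions): instead of a synthetic ValueList dimension, generate a real SubColumn field in the script by loading the fact table once per sub-measure:

Facts:
LOAD Key, Dim1, Dim2, 'A' as SubColumn, Sales as Value RESIDENT RawFacts;
CONCATENATE (Facts)
LOAD Key, Dim1, Dim2, 'B' as SubColumn, Cost as Value RESIDENT RawFacts;
CONCATENATE (Facts)
LOAD Key, Dim1, Dim2, 'C' as SubColumn, Margin as Value RESIDENT RawFacts;
DROP TABLE RawFacts;

The pivot table can then use SubColumn as an ordinary dimension with a plain Sum(Value), so the object no longer has to evaluate conditional logic per synthetic dimension value.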