Re: help - performance improvements - Qlik Community

2014-05-24

If I have a line of code like this:

if(a=b,1,
if(c=f,2,
if(c=g,3,4)))

My question: if the first condition is true (i.e. a=b), does the engine (parser) still waste its time and slow things down by checking the remaining conditions?

rwunderlich · ‎2014-05-24

What you are describing is commonly called "short circuiting". I believe the engine does short-circuit: when the condition is true, the remaining else conditions are not evaluated. I'm not sure of this, though. Perhaps someone from QT development can weigh in on this question. Maybe Henric Cronström?

-Rob

hic · ‎2014-05-26

The else part of the expression is indeed evaluated, even when it is not needed. You can test this yourself by using an input box with a variable, and a pivot table with a (heavy) expression that uses the variable, e.g.

If( vTestVariable = 1, 'simple...', Count( distinct FieldWithManyRecords ) )

Note how long it takes to calculate this when vTestVariable=0, then change the value of the variable to 1. It takes the same amount of time to calculate.

So, there is still room for optimization...
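To make the difference concrete, here is a minimal sketch in Python (not Qlik) that models the behaviour described above. It assumes If() behaves like an ordinary function whose arguments are all computed before the call. The names `eager_if` and `heavy_count` are hypothetical, and the real engine is certainly more involved.

```python
# Illustrative Python model only; this is not Qlik code or the Qlik engine.
calls = {"heavy": 0}

def heavy_count():
    """Hypothetical stand-in for Count(distinct FieldWithManyRecords)."""
    calls["heavy"] += 1
    return len(set(range(100_000)))

def eager_if(condition, then_value, else_value):
    """Function-style If(): both branch values are computed before the call."""
    return then_value if condition else else_value

v_test_variable = 1

# Eager: heavy_count() runs even though the condition is true.
result_eager = eager_if(v_test_variable == 1, "simple...", heavy_count())
eager_calls = calls["heavy"]

# Short-circuit: Python's conditional expression skips the unused branch.
calls["heavy"] = 0
result_lazy = "simple..." if v_test_variable == 1 else heavy_count()
lazy_calls = calls["heavy"]
```

In this model the eager version pays for the heavy calculation once even though its result is discarded, while the short-circuit version never runs it.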

HIC

rwunderlich · ‎2014-05-26

This is very interesting. Would a pick(match()) evaluate any less? For example:

pick(match('1', '1','2'),'simple',Count( distinct FieldWithManyRecords ) )

In this expression, would the count(...) be evaluated as well?

-Rob

2014-05-26

thanks Rob, will sleep on it

hic · ‎2014-05-26

Yes, the Count(...) is evaluated in your expression as well. Just tested it...
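The same point can be modelled for pick(match()). The Python sketch below (not Qlik) uses hypothetical stand-ins `match` and `pick` that mimic the 1-based indexing of the Qlik functions. It assumes every argument is computed before `pick` chooses one, which is what the test above suggests.

```python
# Illustrative Python model only; not Qlik code.
evaluated = []

def simple():
    evaluated.append("simple")
    return "simple"

def heavy_count():
    """Hypothetical stand-in for Count(distinct FieldWithManyRecords)."""
    evaluated.append("count")
    return 12345  # placeholder value, not a real result

def match(value, *candidates):
    """Return the 1-based position of value among candidates, or 0 if absent."""
    for position, candidate in enumerate(candidates, start=1):
        if value == candidate:
            return position
    return 0

def pick(n, *values):
    """Return the n-th value (1-based), or None when n is out of range."""
    return values[n - 1] if 1 <= n <= len(values) else None

# All arguments are computed before pick() runs, so heavy_count() is called
# even though position 1 ('simple') is the one selected.
result = pick(match("1", "1", "2"), simple(), heavy_count())
```

Here `evaluated` records both branches, although only the first is returned.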

HIC

rwunderlich · ‎2014-05-27

Thanks for your input Henric. As always, I'm grateful for your insight and transparency.

-Rob

magavi_framsteg · ‎2014-05-27

Hi Rob and Henric.

I did not believe you at first when my colleague showed me your post.

My first question was "Could QT really have broken the standard of conditional evaluation?"

Yes they can!

I suspect it has to do with UI responsiveness, to cache all expressions for a better user experience.

But alas, I hate it when application vendors try to help bad programmers by trading common, standardised behaviour for ease of use...

It is not the behaviour a programmer would expect.

Normally, conditions exit after a match, but not in this case.

It seems that it evaluates the expressions for all conditions no matter what.

I tried to replicate the same behaviour with conditional expressions and, happily, it works as one would expect:

The expression is not evaluated if the pre-condition fails.

Do not confuse this with conditions in expressions.

Condition in expression:

if (vVar = 0, sum(iCounter),
    if (vVar = 1, count(distinct SSN),
        count(distinct SSN) + count(distinct %KEY_SSN_YearMonth)
    )
)

//Regards

Magnus Åvitsland

Framsteg.com

Stockholm, Sweden

hic · ‎2014-05-27

I agree that at first glance one would think that the optimal behaviour must be not to evaluate remaining conditions. But the question is a lot more complicated than that...

The algorithm to calculate an expression is extremely complex. Say, for instance, that you have a chart with multiple dimensions: then the expression must be evaluated for each combination of the dimensional field values, i.e. the Cartesian product of the constituent fields. And this must work in an arbitrary data model.

Further, the argument of the aggregation function could involve fields from different tables, e.g. Sum(A*B) where A and B sit in two tables far from each other. The aggregation then needs to take place in a virtual n-tuple created from the Cartesian product of A and B, where the argument of the aggregation function is to be evaluated once per row. But the expression is not parsed for every row; instead (for performance reasons) the expression is converted to assembler code and executed for each row.

So, in the general case, a chart involves a double Cartesian product using an arbitrary number of fields in both levels. It is like having a SELECT statement with an arbitrary number of fields in the argument of the aggregation function and an arbitrary number of GROUP BY fields, but without having direct information about the JOINs...
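As a rough illustration of the "double Cartesian product", the Python sketch below (not Qlik, and not how the engine is implemented) computes something like Sum(A*B) per dimension value. The tables `orders` and `prices`, the linking `key`, and the dimension `Dim` are all hypothetical. The sketch assumes the two tables are linked through a single key, whereas a real data model can have any number of hops.

```python
from collections import defaultdict

# Hypothetical tables linked by a shared key.
orders = [  # (key, Dim, A)
    (1, "North", 2),
    (2, "North", 3),
    (3, "South", 5),
]
prices = [  # (key, B)
    (1, 10),
    (2, 20),
    (3, 30),
    (3, 40),
]

def sum_a_times_b_by_dim(orders, prices):
    """Model Sum(A*B) grouped by Dim over the linked (A, B) combinations."""
    b_by_key = defaultdict(list)
    for key, b in prices:
        b_by_key[key].append(b)

    totals = defaultdict(int)
    for key, dim, a in orders:
        # Every linked (A, B) pair forms one virtual row.
        for b in b_by_key[key]:
            # The aggregation argument A*B is evaluated once per virtual row.
            totals[dim] += a * b
    return dict(totals)

result = sum_a_times_b_by_dim(orders, prices)
```

The outer grouping corresponds to the chart dimensions and the inner loop to the virtual rows that feed the aggregation.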

And then we need to add the possibility of any number of scalar functions at any level of the expression, e.g. any number of nested if()-functions. Needless to say, the algorithm is quite complex, and when it was implemented, we just didn't manage to short-circuit the evaluation. Even today, I am not sure that it would be possible to combine short-circuiting with the assembler code.

HIC

2014-05-28

Hi Clive, I had a similar issue, except that my if statement was huge and I wanted to simplify it (rather than optimise the load).

If you use a combination of a lookup table and the alt() function, which returns the first of its arguments that has a valid number representation (so nulls are skipped) and otherwise the last argument, you can get a similar result.

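Lookup:
Load * Inline [
a, c, aRes, cRes
b, f, 1, 2
b, g, 1, 3
];

// Attach aRes and cRes to the main data (joins on the shared fields a and c)
left join (Main_Data) Load * resident Lookup;

drop table Lookup;

// Take the first available result, defaulting to 4
left join (Main_Data)
Load
    SingleKeyField,
    alt(aRes, cRes, 4) as resultField
resident Main_Data;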

Here I've assumed that there are no alternatives for a and c except null. This would yield the same result as your if statement. I have no idea about the impact on performance, though.
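For readers unfamiliar with alt(), here is a small Python sketch (not Qlik) of the fallback logic in the last load. It models alt() as returning the first non-null argument and falling back to the last one. The real function tests for a valid number representation, which this sketch does not model. The sample rows are hypothetical.

```python
def alt(*values):
    """Return the first non-None value before the last argument, else the last."""
    *candidates, default = values
    for value in candidates:
        if value is not None:
            return value
    return default

# Hypothetical rows after the lookup join; None stands for a null field.
rows = [
    {"aRes": 1, "cRes": 2},
    {"aRes": None, "cRes": 3},
    {"aRes": None, "cRes": None},
]

results = [alt(row["aRes"], row["cRes"], 4) for row in rows]
```

A row with no lookup match falls through to the default of 4, mirroring the final else branch of the nested if.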

Erica