Re: Set Analysis performance - Qlik Community

Vitali · ‎2019-08-14

Hello,

Can anyone tell me whether, on a larger dataset, there would be a difference in performance between these two sample expressions?

1) Count({1<SalesOffice-={'Lund'}, JobTitle={"Systems Manager"}>}DISTINCT EmployeeID)

2) Count({1<JobTitle={"Systems Manager"}>-<SalesOffice={'Lund'}>}DISTINCT EmployeeID)

Any resources you can recommend about calculation performance would be welcome too.

Thank you in advance!

Regards,

Vitali Burla

tresB · ‎2019-08-14

I haven't tested it, nor do I know exactly what algorithm the Qlik engine follows at the back end. I would only say that if there is any difference between them, it is likely to be negligible, and the faster one (if at all) should be the first.

Now let me explain why. In the first expression the filters are of an AND nature: the SalesOffice exclusion and the JobTitle inclusion both have to be true for the same record, because they are part of the same set modifier, separated by a comma. In the second expression the filters are independent. So for the first expression there is scope for applying one filter to the data set and then applying the second one to that already reduced data set, which shortens the search time a little. In the second expression each filter would be applied separately to the entire data set, which takes longer.
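To make the "AND" versus "independent" distinction concrete, here is a rough Python model of the record sets the two expressions describe. It is only a sketch, not a picture of how the Qlik engine evaluates anything: the sample rows and function names are made up, everything sits in one table, and it assumes no selections are active. As far as I understand the semantics, it also shows that the two expressions are not strictly equivalent when SalesOffice is null, because a field selection cannot match a null value whereas a set difference keeps such records.

```python
# Hypothetical sample data: (EmployeeID, SalesOffice, JobTitle).
# None stands in for a null SalesOffice.
ROWS = [
    (1, "Lund", "Systems Manager"),
    (2, "Malmo", "Systems Manager"),
    (3, "Malmo", "Sales Rep"),
    (4, None, "Systems Manager"),
    (5, "Lund", "Sales Rep"),
    (2, "Malmo", "Systems Manager"),  # duplicate row, DISTINCT counts it once
]


def count_one_modifier(rows):
    """Model of expression 1: both conditions in a single modifier (AND)."""
    # SalesOffice-={'Lund'} selects the existing non-Lund values, so rows
    # with a null SalesOffice match no selected value and drop out.
    return len({
        emp for emp, office, title in rows
        if office is not None and office != "Lund"
        and title == "Systems Manager"
    })


def count_set_difference(rows):
    """Model of expression 2: two independently built sets, then a minus."""
    managers = {r for r in rows if r[2] == "Systems Manager"}
    lund = {r for r in rows if r[1] == "Lund"}
    return len({emp for emp, _, _ in managers - lund})


print(count_one_modifier(ROWS))    # 1 -> employee 2
print(count_set_difference(ROWS))  # 2 -> employees 2 and 4
```

If SalesOffice never contains nulls, both functions return the same count, so any remaining difference would be purely one of performance.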

Vitali · ‎2019-08-14

These are my thoughts exactly, I'm just looking for confirmation (hopefully! 🙂 )

marcus_sommer · ‎2019-08-14

Like Tresesco, I don't know how it's processed internally, and I agree completely with his first paragraph. As for the second paragraph, though, I believe it works rather the other way round.

AFAIK, set analysis works (nearly) the same way as a selection. In the first example, I think both conditions are evaluated in parallel (I assume multi-threaded), each returning TRUE or FALSE for the values in the corresponding field or, rather, in its system table. The engine then builds the scope, in effect a virtual table, to which the actual aggregation is applied. In the second example the conditions are chained and might be executed one after another. In this simple case that might not be necessary, but in general more complex and even nested chains are possible, which may require additional evaluation steps.

Besides this, with larger datasets it might be worth testing whether one or several flags created in the script improve the UI performance, for example with:

Count({1< EmployeeFlag ={1}>} DISTINCT EmployeeID) or Count({1} DISTINCT EmployeeID) * EmployeeFlag

And of course the kind of data model will also have an impact on the performance of the UI calculations.

- Marcus

Vitali · ‎2019-08-15

Hello Marcus,

Thank you very much for your input - I very much appreciate it. Well, I guess I'll just have to try this on a larger dataset, once I come across one.
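In case it helps anyone who wants to try the same thing without waiting for real data, a test file could be generated with something like the Python sketch below. The office and title lists, the function names and the file name are all invented for illustration (only 'Lund' and 'Systems Manager' come from the expressions above), and the rows are purely random, so the same EmployeeID can show up under several offices.

```python
import csv
import random

# Hypothetical value lists; only 'Lund' and 'Systems Manager' come from the
# expressions in this thread, the rest are invented for the test.
OFFICES = ["Lund", "Malmo", "Stockholm", "Gothenburg"]
TITLES = ["Systems Manager", "Sales Rep", "Analyst", "Developer"]


def generate_rows(n_rows, n_employees, seed=42):
    """Return n_rows random (EmployeeID, SalesOffice, JobTitle) tuples."""
    rng = random.Random(seed)  # fixed seed so test runs are repeatable
    return [
        (rng.randint(1, n_employees), rng.choice(OFFICES), rng.choice(TITLES))
        for _ in range(n_rows)
    ]


def write_csv(path, rows):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["EmployeeID", "SalesOffice", "JobTitle"])
        writer.writerows(rows)


# Example: write_csv("employees_test.csv", generate_rows(5_000_000, 200_000))
```

The resulting CSV can then be loaded into an app and both expressions timed against it with the same selection state.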

Thanks once again.

Regards,

Vitali
