Using set analysis with if and aggregation

Pasha1 · ‎2021-04-27

Hi guys,

I have a pretty slow calculations in a table. And I need to improve performance here. My thought is that using set analysis instead of if can speed it up.

The table data is an object ID usage hours by date: [Object ID] | [Run Hours] | [Date]. I need to calculate the % of object usage over time but it's also possible to set a threshold for run hours.

Current calculation looks like this:

(

Count(
distinct if(
aggr(sum([Run Hours]),
[Object ID],
Date
) > vMinRunHr,
Date)

)

) / count(distinct Date)

I tried something like this, but it doesn't work:

(

count(

{$<[Object ID] =

{"=aggr(sum([Run Hours]), [Object ID], Date) > vMinRunHr"}>} distinct Date

)

) / count(distinct Date)

Am I doing something wrong and are there any other ways to make this calculation work faster when there are a lot of data?

Thanks

Digvijay_Singh · ‎2021-04-27

Can you share some sample data to try few options? whats the table dimensions?

marcus_sommer · ‎2021-04-28

How fast or slow a calculation is performed depends (beside the available hardware) mainly on the used datamodel and how well it's designed respectively how suitable it is for wanted views. Especially in regard to optimize the UI performance it's recommended to develop the datamodel in the direction of a star-scheme or by really huge datasets even as a big flat table.

The main-aim behind it is to ensure that all related fields for the most common respectively the most heavy calculations come from a single-table to shortcut the Qlik efforts to create the needed virtual tables which define the dimensionally context for each single expression.

Beside this you should check if you really need the aggr() at all - means: is they logic mandatory for the calculation and if could they be replaced with another dimensionality in the object or be replaced with some pre-calculations in the script? Further helpful in regard to performance is to move the condition from the inside to the outside of the aggregation, means:

if(condition, expression())

is faster as:

expression(if(condition))

Another approach could be to replace and/or to extend the calculation with some Boolean logic, like:

sum(aggr(-(sum([Run Hours]) > vMinRunHr), [Object ID], Date))

- Marcus

Pasha1 · ‎2021-04-28

@Digvijay_Singh I added the qvf with sample data to the post. Duplicating data load here:

Activity:
LOAD * Inline [
SysytemSk , Run Hours , DateSk
1, 0.4, 11,
2, 0, 11,
3, 0, 11,
1, 0.6, 22,
2, 1.2, 22,
3, 0, 22,
1, 0, 33,
2, 0.8, 33,
3, 1, 33,
1, 0.4, 44,
2, 0.1, 44,
3, 0, 44,
1, 0, 55,
2, 0.1, 55,
3, 0, 55,
1, 0, 66,
2, 0, 66,
3, 0, 66,
];

Dates:
LOAD * Inline [
DateSk , Date
11, 1/2/2021,
22, 2/2/2021,
33, 3/2/2021,
44, 4/2/2021,
55, 5/2/2021,
66, 6/2/2021,
];

Systems:
LOAD * Inline [
SysytemSk , Object ID
1, Obj 1,
2, Obj 2,
3, Obj 3
];

Pasha1 · ‎2021-04-28

@marcus_sommer thank you for this detailed comment! I have some ideas to try now

AGGR in set analysis