Pasha1
Contributor

Using set analysis with if and aggregation

Hi guys,

I have a pretty slow calculation in a table and need to improve its performance. My thought is that using set analysis instead of if() might speed it up.

The table holds object usage hours by date: [Object ID] | [Run Hours] | [Date]. I need to calculate the % of object usage over time, and it should also be possible to set a threshold for run hours.

The current calculation looks like this:

(
  Count(distinct
    if(
      aggr(sum([Run Hours]), [Object ID], Date) > vMinRunHr,
      Date
    )
  )
) / count(distinct Date)

 

I tried something like this, but it doesn't work:

 

(
  count(
    {$<[Object ID] = {"=aggr(sum([Run Hours]), [Object ID], Date) > vMinRunHr"}>}
    distinct Date
  )
) / count(distinct Date)

 

 

Am I doing something wrong, and are there other ways to make this calculation faster when there is a lot of data?

Thanks

 

 

 

4 Replies
Digvijay_Singh

Can you share some sample data to try a few options? What are the table dimensions?

marcus_sommer

How fast or slow a calculation performs depends (besides the available hardware) mainly on the data model used and how well it is designed, i.e. how suitable it is for the wanted views. Especially with regard to optimizing UI performance, it's recommended to develop the data model towards a star schema or, for really huge data sets, even a single big flat table.

The main aim behind this is to ensure that all fields involved in the most common and heaviest calculations come from a single table, which shortcuts the effort Qlik needs to create the virtual tables that define the dimensional context for each expression.
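For example, using the sample tables posted further down in this thread purely as an illustration (a sketch, not tested), the three tables could be merged into one flat fact table in the script:

// assuming Activity, Dates and Systems are already loaded as in the sample below
FlatFact:
NoConcatenate LOAD * Resident Activity;

// pull the dimension fields onto the fact table via the key fields
Left Join (FlatFact) LOAD DateSk, Date Resident Dates;
Left Join (FlatFact) LOAD SysytemSk, [Object ID] Resident Systems;

// remove the now redundant source tables
DROP Tables Activity, Dates, Systems;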

Besides this, you should check whether you really need the aggr() at all: is its logic mandatory for the calculation, or could it be replaced with a different dimensionality in the object or with some pre-calculation in the script? Also helpful for performance is moving the condition from the inside to the outside of the aggregation, i.e.:

if(condition, expression())

is faster than:

expression(if(condition))
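
As a rough illustration with the fields from this thread (note the two forms are not logically equivalent, they just show the evaluation pattern):

// condition inside the aggregation: the if() is evaluated for every record
sum(if([Run Hours] > vMinRunHr, [Run Hours]))

// condition outside the aggregation: evaluated once on the aggregated result
if(sum([Run Hours]) > vMinRunHr, sum([Run Hours]))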

Another approach could be to replace and/or to extend the calculation with some Boolean logic, like:

sum(aggr(-(sum([Run Hours]) > vMinRunHr), [Object ID], Date))
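
If that flag logic fits the requirement, the full ratio might then look like this (a sketch; it counts qualifying [Object ID]/Date combinations, so it only matches the original distinct-date count when each chart row is limited to a single [Object ID]):

sum(aggr(-(sum([Run Hours]) > vMinRunHr), [Object ID], Date)) / count(distinct Date)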

- Marcus

Pasha1
Contributor
Author

@Digvijay_Singh I added a qvf with sample data to the post. Duplicating the data load here:

Activity:
LOAD * Inline [
SysytemSk , Run Hours , DateSk
1, 0.4, 11,
2, 0, 11,
3, 0, 11,
1, 0.6, 22,
2, 1.2, 22,
3, 0, 22,
1, 0, 33,
2, 0.8, 33,
3, 1, 33,
1, 0.4, 44,
2, 0.1, 44,
3, 0, 44,
1, 0, 55,
2, 0.1, 55,
3, 0, 55,
1, 0, 66,
2, 0, 66,
3, 0, 66,
];

Dates:
LOAD * Inline [
DateSk , Date
11, 1/2/2021,
22, 2/2/2021,
33, 3/2/2021,
44, 4/2/2021,
55, 5/2/2021,
66, 6/2/2021,
];


Systems:
LOAD * Inline [
SysytemSk , Object ID
1, Obj 1,
2, Obj 2,
3, Obj 3
];

Pasha1
Contributor
Author

@marcus_sommer Thank you for this detailed comment! I have some ideas to try now.