Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
wizardo
Creator III
Creator III

sum(aggr(avg(Invoice_amount),Customer)) OR sum(aggr(max(Invoice_amount),Customer)) which is fatser?

many times when constructing a data model you need to

make a join between Header and Details tables into one new table(i know you can avoid it in some cases but let's assume you cant)

for example:

invoice_header joined with invoice_details (on customer and invoice number) to make a new INVOICES table.

this has the side effect of multiplying the values in each row of the header times the number of corresponding lines in the details.

so if you need to perform some aggregation on a field originating from the header you must deal with this multiplication.

usually by using one of these expressions: (simplified for simplicity 🙂 )

sum(aggr(avg(Invoice_amount),Customer))

OR

sum(aggr(max(Invoice_amount),Customer,invoice))

the question I'm trying to raise for discussion is which is faster?

as I see it,

when using the AVG version qlik need to sum the amount and divide by the number of lines

but when using the max version qlik needs to sort the values to find the max

is it possible there is no difference? 

do you think any is faster, if so why?

or you think this is a redundant unimportant question and you have better things to spend your time on 🙂

Daniel

 

 

1 Reply
marcus_sommer

I wouldn't expect a significantely difference between avg and max but I think it's in general a bad approach because you would need to apply this correction-method to (nearly) all expressions and the use of aggr() are (beside nested if-loops) the biggest performance-killer.

I really doubt that are many usecases in which no other solutions within the datamodel are possible respectively practically. Quite often a header and detail table could be joined without duplicating the records - if there are really n:n relationships such a process might be splitted into several tasks by using DISTINCT loads and if not the joining could be replaced with a mapping.

Beside this often it's not neccessary to merge these tables else just to associate them within the datamodel.

- Marcus