Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
adriano_stefane
Partner - Contributor
Partner - Contributor

Complex expression doesn't allow to open qvw with many data

Dear QV experts,

I've a *.qvw document with a line-chart, where we've 3 dimension fields and 1 complex expression, like:

Dimensions

  • contentid (Enable Conditional =0)
  • consumption (Enable Conditional =0)
  • =Aggr(num(Rank(Sum({<contentid=>} consumption),4)), contentid)

Expression

  • =2*((Rangesum(below(TOTAL (Aggr(num(Rank(Sum({<contentId=>} consumption),4)),contentid)*(sum(consumption)/sum(TOTAL consumption))),0,max(TOTAL(Aggr(num(Rank(Sum({<contentid=>} consumption),4)),contentid))))))*(Rangesum(below(TOTAL (1/(Rangesum(below(TOTAL sum(consumption/sum(TOTAL consumption)),0,max(TOTAL (Aggr(num(Rank(Sum({<contentid=>} consumption),4)),contentid))))))),0,max(TOTAL(Aggr(num(Rank(Sum({<contentid=>} consumption),4)),contentid)))))))-1

The expression calculates the effective catalog size (ECS). It is a metric that describes how spread viewing is across the items in our catalog. If most viewing comes from a single video, it will be close to 1. If all videos generate the same amount of viewing, it is close to the number of videos in the catalog. Otherwise it is somewhere in between.


In test environment, *.qvw document working fine, refresh takes 30sec, app size is 300KB and contains 7.000 rows. Into production environment when open the document, appear an error popup with message: "there was a problem sending the command to the program". Here, refresh takes 2min, app size is 20MB and contains 7.000.000 rows.


Probably expression is much complex to be calculate on the fly, but unfortunately can't be calculated a backend, because data depends by selection on Calendar fields. As default, expression is calculated for a single day.

Additional info:

  • i'm using QlikView 12.10 SR7
  • I can't share my *.qvw document
  • HW specs, for Test 4 processors * 2Ghz and 64GB RAM, for Prod 16 processors * 2Ghz and 128GB RAM
  • Attached, a sample data

Thanks in advance.

3 Replies
marcus_sommer

I'm not surprised that this is quite slow to uncalculable by so many aggr and TOTAL within the interrecord-function below. I have the feeling that not all of the aggr, below and TOTAL are necessary and it could be simplified in some way.

I suggest that you make a re-start with this task beginning with rather simple sum/count expressions and if they return the expected results and needs to consolidated on step more if there are simpler solutions as just using aggr - especially by including some script pre-calculations.

- Marcus

adriano_stefane
Partner - Contributor
Partner - Contributor
Author

Hi Marcus,

thanks for your feedback, I've done as you said and below the best way of my review:

=2*((Rangesum(below((Aggr(num(Rank(Sum(consumption),4)),contentid)*(sum(consumption)/sum(TOTAL consumption))),0,max(TOTAL(Aggr(num(Rank(Sum(consumption),4)),contentid))))))*(Rangesum(below((1/(Rangesum(below(TOTAL sum(consumption/sum(TOTAL consumption)),0,max(TOTAL (Aggr(num(Rank(Sum(consumption),4)),contentid))))))),0,max(TOTAL(Aggr(num(Rank(Sum(consumption),4)),contentid)))))))-1

but nothing is changed. I've also splitted the entire expression into some variables declared on Variables Overview like this:

MaxRank =max(TOTAL(aggr(num(rank(sum(consumption),4)),contentid)))

k1 =aggr(num(rank(sum(consumption),4)),contentid)*(sum(consumption)/sum(TOTAL consumption))

k2 =1/(rangesum(below(TOTAL sum(consumption/sum(TOTAL consumption)),0,max(TOTAL (aggr(num(rank(sum(consumption),4)),contentid))))))

ECS =2*((rangesum(below($(k1),0,$(MaxRank))))*(rangesum(below($(k2),0,$(MaxRank)))))-1

but nothing is changed.

If you are interested, I've attached an excel file with calculation and for your reference you can find formula at the following link Appendix section..

Many thanks

marcus_sommer

To separate expressions or parts of it in variables is definitely useful to improve the readability and maintainability but it won't increase the performance (unless it leads to some precalculation and not an expression is returned else a value) because the executed expression itself didn't change.

I meant to check if you really need all the aggr-functions, see:

When should the Aggr() function NOT be used?

Pitfalls of the Aggr function

- Marcus