Anonymous
Not applicable

Top 10 Clients Per Product based on 130 million unique combinations

Hello All,

I hope you are having a good day.

So, the time has come to deal with large data sets in QlikView (at least for me). The requirement is to show, in a straight table, the top 10 clients per product. Since there are 130M unique product-client combinations in the data model (although only 30 products), I'm running into poor performance. The expression is a simple Sum(Amount) calculation. The machine the QlikView Server runs on has 40 CPU cores (4 processors) and 512GB RAM. There is no way to move the calculations into the load script, as the functionality has to react to selections.

Would you advise the best method to handle situations like that, please?

Best regards,

Janusz

6 Replies
swuehl
MVP

Have you tried Client as dimension, with this expression?

=Sum({<Client = {"=Rank(Sum(Amount))<=10"}>} Amount)

Anonymous
Not applicable
Author

Hi swuehl

Thanks for the prompt reply.

The expression you suggested returns the top clients within the current selections, whereas I'm looking for the top clients per product. Please see the attached example app.

Many thanks in advance for your help.

Best regards,

Janusz

Anonymous
Not applicable
Author

Apologies for the direct approach. rwunderlich, troyansky, BarryHarmsen, would you have a minute to help, please?

Many thanks in advance for your help.

Best regards,

Janusz

Oleg_Troyansky
Partner Ambassador/MVP

Anonymous
Not applicable
Author

Hello Oleg,

Thanks a mill for getting back to me.

I've read this article before, but after reading this section:

"Notice that this solution cannot be used for the previous problem, where we wanted to show Top 5 Items per Brand. The reason for that is the well-known limitation of Set Analysis – it only gets calculated once per chart, not once per cell, and hence it cannot be sensitive to individual Dimension values. Hence, we can’t use Set Analysis to calculate the Top Items per Brand (i.e. per Dimension value)."

I thought I was stuck with the Dimension Limits or Expression Limits solutions, which do not perform well.

So, I decided to give it one more chance and I believe I found a solution (please see attached file).

I created an additional field that combines ClientName and ProductName and came up with the following expression:

Sum({<
    Client_Product = {"=
        Max(
            Aggr(
                If(Rank(Sum(Amount), 4) <= $(vTopN), 1),
                ClientName, ProductName
            )
        )
    "}
>} Amount)
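For reference, the combined field used in this set expression could be built in the load script along these lines (table, field, and file names here are illustrative assumptions, not taken from the actual app):

```
// Hypothetical load script sketch: build the combined key field.
// Table name and QVD source are assumptions for illustration.
Facts:
LOAD
    ClientName,
    ProductName,
    Amount,
    ClientName & '|' & ProductName AS Client_Product
FROM [Transactions.qvd] (qvd);
```

Note that with 130M unique combinations this field will have as many distinct values, which is the memory overhead Stefan discusses below.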

It seems to work, but since it appears to contradict what is written in the article, I would much appreciate it if you could test this solution.

Best regards,

Janusz

swuehl
MVP
MVP

Janusz,

Your solution does not contradict what is written in the article. You create a set based on a key field, Client_Product, where each value is by definition already unique for any combination of your chart dimensions, ClientName and ProductName. Thus the evaluation does not need to be sensitive to the individual chart dimensions, in the sense of being aware of the dimension values at the moment the set is evaluated.

What you've done is indeed a possible solution.

It's more a question of the performance of this solution. Basically, you transferred the work done in a calculated dimension to the advanced search in the set expression, with the potential additional overhead of creating a key field with millions of distinct values in the script, and the overhead of a set analysis over such a large key field.
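For comparison, the calculated-dimension approach being contrasted here could look something like this (a sketch only; vTopN and the field names are assumed from the thread, and performance on 130M combinations would need testing):

```
// Calculated dimension (sketch): keep only the top-N clients per product.
// Rank mode 4 gives each row a unique rank even on ties; rows returning
// NULL drop out when "Suppress When Value Is Null" is enabled for the
// dimension.
=Aggr(
    If(Rank(Sum(Amount), 4) <= $(vTopN), ClientName),
    ProductName, ClientName
)
```

Used as a chart dimension together with ProductName and a plain Sum(Amount) expression, this ranks clients within each product without needing the combined key field.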

Looking at your sample, the QVW almost doubled in size, the chart with the set expression is a lot larger than the chart with the dimension limits, and the calculation time also seems quite a bit longer.

Nevertheless, this might depend on the specifics of your sample data to some extent, so it's probably a good idea to test with your real data.

I would assume, though, that with your real data the performance will also not be as good as with the dimension limits.

Regards,

Stefan