# Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Employee

# Recipe for a Pareto Analysis

“Which products contribute to the first 80% of our turnover?”

This type of question is common in all types of business intelligence. I say “type of question” since it appears in many different forms: Sometimes it concerns products, but it can just as well concern customers, suppliers or sales people. It can really be any dimension. Further, here the question was about turnover, but it can just as well be number of support cases, or number of defect deliveries, etc. It can in principle be any additive measure.

It is called Pareto analysis. Sometimes also known as 80/20 analysis or ABC analysis.

The logic is that you first sort the products according to size, then accumulate the numbers, and finally calculate the accumulated measure as a percentage of the total. The products contributing to the first 80% are your best products; your “A” products. The next 10% are your “B” products, and the last 10% are your “C” products.

And here’s how you do it in QlikView:

1. Create a pivot table and choose your dimension and your basic measure. In my example, I use Product and Sum(Sales).

2. Sort the chart descending by using the measure Sum(Sales) as sort expression. It is not enough just to check “Sort by Y-value”.

3. Add a second expression to calculate the accumulated sales value:
RangeSum(Above(Sum(Sales), 0, RowNo()))
Call this expression Accumulated Sales. The Above() function will return an array of values – all above values in the chart – and the RangeSum() function will sum these numbers.

4. Create a third expression from the previous one; one that calculates the accumulated sales in percent:
RangeSum(Above(Sum(Sales), 0, RowNo())) / Sum(total Sales)
Format it as a percentage and call it Inclusive Percentage.

5. Create a fourth expression from the previous one; one that calculates the accumulated sales in percent, but this time excluding the current row:
RangeSum(Above(Sum(Sales), 1, RowNo())) / Sum(total Sales)
Format it as a percentage and call it Exclusive Percentage.

6. Create a fifth expression for the ABC classification:
If([Exclusive Percentage] <= 0.8, 'A', If([Exclusive Percentage] <= 0.9, 'B', 'C'))
Call this expression Pareto Class. The reason why the Exclusive Percentage is used, is that the classification should be determined by the lower bound of a product’s segment, not the upper.

7. Create a conditional background color, e.g.
If([Pareto Class] = 'C', LightRed(), If([Pareto Class] = 'B', Yellow()))

You should now have a table similar to the following. In it you can clearly see the classification of different products.

In this table, there are five different expressions that you can use for Pareto analysis. The graph in the beginning of this post uses Sales and Inclusive Percentage for the bars and the line, respectively; and Pareto Class for the coloring of the bars.

Further, you may want to combine the Pareto Class and the Exclusive Percentage into one expression:

Pareto Class =
If(RangeSum(Above(Sum(Sales),1,RowNo())) / Sum(total Sales) <= 0.8, 'A',
If(RangeSum(Above(Sum(Sales),1,RowNo())) / Sum(total Sales) <= 0.9, 'B', 'C'))

Good luck in creating your Pareto chart.

HIC

Further reading related to this topic:

Recipe for an ABC Analysis

Recipe for a Pareto Analysis – Revisited

Employee

With the above method, it is unfortunately not possible to show subtotals or to use the ABC-classification as a dimension.

However, if you use a different definition of the ABC-classification, it's possible. E.g. use the following as calculated dimension:

=If(Aggr(Rank(total Sum(Sales))< 0.4*Count(distinct total ProductName),ProductName),'A',

If(Aggr(Rank(total Sum(Sales))< 0.6*Count(distinct total ProductName),ProductName),'B',

'C'))

I.e. if the product ranks within the best 40% of the products (responsible for most of the turnover), then the product is an 'A' product. And if it is between 40% and 60%, it is a 'B', etc.

With this definition you can use subtotals.

HIC

140 Views
Not applicable

Thanks for the tip, Henric! I will try your suggestion.

I'm currently using a workaround: define three variables and displays them in the input fields(readonly). To recalculate the variables using a macros that is called from the triggers. Сertainly not the best solution.

Yeah, I remember your recommendations "Macros are Bad, but Triggers are Worse", but ...

140 Views
Not applicable

Hi guys,

I was able to create a chart table according to HIC example but i fail to create a combo chart. I can't get reference line to be shown and the right axes shows me 120%. What I'm doing wrong?

140 Views
Employee

For the reference line, I use the following on the Presentation tab:

and for the static (100%) y-axis, I use the following on the Axes tab:

HIC

140 Views
Not applicable

Thank you. Just i see that Inclusive percentage might trick the users because sometimes the inclusive percentage point is above the reference line which is 80% so users might get confused why something that is above the 80 percent reference still get an A class

140 Views
Partner

Hi Henric,

I was able to show a Pareto Graph for TOP 15 customers and 3 Reference Lines to show Top 5, Top 10 and Top 15 contribution, like this below expressions and graph:

Top 15 Expression:

Sum({<[Cust Name] = {"=Rank(sum(TotalRevenue ))<=15"}>} TotalRevenue )

Accumulated % Expression:

RangeSum(Above(sum({<[Cust Name] = {"=Rank(sum(TotalRevenue ))<=15"}>} TotalRevenue ) , 0, RowNo()))

/

sum(TOTAL TotalRevenue )

Reference Line Expression:

Sum(Aggr(If(Rank(Sum (TotalRevenue))<=5,Sum (TotalRevenue)),[Cust Name])) / Sum (TotalRevenue)

Now my requirement is :Can it be possible to make the Reference Lines Dynamic somehow (Top N)? I mean to say, the User should have the liberty to enter the value of 'N' somewhere and the reference line should display the Top N value? According to the input provided by User the graph would become dynamic.

Could you please suggest some ideas to implement or have any other ideas for better functionality?

Regards!

140 Views
Not applicable

Hi HIC,

I have another tricky question to ask. How can i see different Pareto class records on  different tables. Like i want to see A class in table1, B class in table2 and C class in table3 at once without making any selections and ofcourse i want to see the original table with all pareto class clasifications. Can this be done? I trried many ways but i fail

140 Views
Contributor II

Henric, Thank you for putting this all together.  I am not seeing different results than I would expect.

First Question: I am trying to identify the top 20% and bottom 20% performers but both Victoria and Leslie have the same amount of Saves as those in C and display in the B vs. C.  I would think they would be in the bottom 20% with Dreai, Desiree and William.

Could you help me understand this.

A Second question:  Is it possible to pull the the minimum of each bucket and place in a variable for use elsewhere?  I need to use the two values for further scoring in the application.

Examples are below.

AgentName     Saves     Exclusive %     Class

Bob                    22          0.00%               A

Sarah                 19          12%                  A

Craig                  16          37%                  B

Tom                    14          52%                 B

Lynn                    9           78%                 B

George                3           83%                 C

Sherrie                1           96%                 C

variable \$(vSavesA) = 19         as it's the minimum value above 20%

variable \$(vSavesB) = 9           as it's the minimum value above 80%

Thank you.

140 Views
Employee

Q1: While Victoria, Leslie, Dreai, Desiree and William all have one save only, and thus share the last position, you cannot say that they all belong to the bottom 20%. Together, they are responsible for the range 70.59% - 100%, and someone is responsible for the 71st percent . The internal order is however arbitrary, so we can't tell who has done what. But QlikView will order them, and assign some to group B and some to group C.

It is just as valid to say that they are all responsible for the 71st percent, and thus none of them belong to the worst 20%, as it is to say that they all are in the worst 20%.

If you want the same classification for the entire group, you should use Rank.

Rank(Sum(Saves),1) will assign rank 4 to all of them.

Rank(Sum(Saves),3) will assign rank 8 to all of them.

Q2: No. Not possible. (But it is possible if you instead of Pareto use Rank.)

HIC

140 Views
Not applicable

Hi thanks for this! I have created the initial Pivot successfully.  But I am having difficulty getting the ABC colours represented in the bars of the combo chart.  Can you show an example screenshot of the expression for the background colour please?

Thank you.

140 Views
Employee

For the [Inclusive Percentage] I use

RangeSum(Above(Sum(Sales), 0, RowNo())) / Sum(total Sales)

and for the Colors I use

If([Inclusive Percentage] <= 0.8, LightGreen(), If([Inclusive Percentage] <= 0.9, Yellow(), LightRed()))

HIC

140 Views
Not applicable

Thats perfect ! many thanks !!!!

140 Views
Not applicable

Hi Henric,

Is the process of creating a Pareto chart in Qlik Sense the same as the one for QlikView? If not, can u please help me with a step by step Process for Creating a Pareto Chart in Qlik Sense?

140 Views
Employee

You can use exactly the same instruction. It will work.

HIC

140 Views
New Contributor III

A master Chef Recipe , very good article

140 Views
Not applicable
 Henric Cronström schreef: With the above method, it is unfortunately not possible to show subtotals or to use the ABC-classification as a dimension. However, if you use a different definition of the ABC-classification, it's possible. E.g. use the following as calculated dimension:      =If(Aggr(Rank(total Sum(Sales))< 0.4*Count(distinct total ProductName),ProductName),'A',        If(Aggr(Rank(total Sum(Sales))< 0.6*Count(distinct total ProductName),ProductName),'B',        'C')) I.e. if the product ranks within the best 40% of the products (responsible for most of the turnover), then the product is an 'A' product. And if it is between 40% and 60%, it is a 'B', etc. With this definition you can use subtotals. HIC

I know you state that it's a different definition but this is a false formula

Counting the ProductNames is not what ABC is about. You can't says that if there are 100 products, 40 of them should have class A as it says nothing about their gross margin / turnover.

140 Views
Employee

What you point out is semantics.

As I see it, a Pareto analysis must consider the turnover: It should show the products that are responsible for the first 80% of the turnover. But an ABC analysis - which is a wider concept - can have a different, more general definition. For example, the best ranked products.

In any case, it is up to you to choose which definition to use. If you choose the rank, and not the turnover, the above formula is correct.

HIC

140 Views
Not applicable

Ok you are right. Rank does not account for skewed data which actually is the whole idea of doing a Pareto 80-20.

As I understand your formula only works for simple ranks.

140 Views
Not applicable

Thanks, useful post.

Is there any way For individually counting for A,B and C?

140 Views
MVP

Informative.Thank you

140 Views
Not applicable

Hi,

I seem to be having an issue with my Percentages.  My Total and Accumulated columns are fine however the Inclusive and Exclusive percentages look way out.....  :

My Inclusive Percentage expression is :

RangeSum(Above(sum({ <[Indicator Number] = {313}>} Number), 0, RowNo())) / sum( { <[Indicator Number] = {313}>} [Line Total])

My Exclusive Percentage expression is :

RangeSum(Above(sum({ <[Indicator Number] = {313}>} Number), 1, RowNo())) /  sum( { <[Indicator Number] = {313}>} [Line Total])

Any help appreciated.

Tom

140 Views
Not applicable

Hi hic

As usual it was another informative article from you. I have a question, maybe i haven't really understood the concept correctly and will appreciate some insights from someone of your calibre.

When we talk of the Pareto Analysis or to say 80:20 rule we are trying to derive something like 20% of the Customers generate 80%(or maybe 75 - 90%) Sales. In the example above and most of the examples ive seen shows which Customers account for 80% of the Sales but it never takes into account the percentage those Customers account for ie it shows the 80% side of the concept but takes for granted the 20% part of the concept. Is this how this is supposed to be or am i missing something. Though it is likely to be the latter, i would appreciate your feedback/explanation on this.

Warm Regards

140 Views
Employee

To say that "20% of the Customers generate 80% of the Sales" is just a rule of thumb. In other words - a gross over-simplification. But it could also be a hypothesis that you want to test.

The above analysis gives you a tool to test this hypothesis. The method really answers the question "Which customers contribute to the first 80% of the sales?" The answer is different in different data sets. In your data, it could be 10% of the customers. Or it could be 40%. In any case, it is a tool that will help you understand your own data and help you prioritize your customers.

HIC

140 Views
Valued Contributor II

Hi hic

I've just created a Pareto chart following this instructions, it works fine without problems; thanks for sharing it.

I have Product as dimension, sum(sales) as bar-expression and ABC as curved line expression

Now I would like to ask you a little variant, how can I paint n Pareto Curves one for each company ?

Thanks,

Joaquín

140 Views
Employee

You can't.

The Product sort order is different for different companies, so it wouldn't make sense to show several companies in a pareto chart.

HIC

140 Views
Not applicable

Currently I am not able to respond to your email. I will be back in the office on Wednesday the 11th ofFebruary.

With Kind regards, Mit freundlichen Grüßen, Met vriendelijke groet,

COEN RISPENS

c.rispens@essenzahome.nl <mailto:m.bree@essenzahome.nl>

Industrieweg Noord 12

3958 VT Amerongen

The Netherlands

Info@essenzahome.nl <mailto:Info@essenzahome.nl>

ESSENZAHOME.NL

140 Views
Valued Contributor II
140 Views
New Contributor III

Hello Mr. hic,

I need to build a more complex Pareto Analysis.

Let's say I already built 2 pareto charts, the first related to Previous Year and the second related to Current Year. I want to analyze now class status changes to build a new report listing all class changes occurred in the form: <Previous Year Class->Current Year Class>  i.e. A->A, A->B, B->A, etc.

For example:

CUSTOMERPREVIOUSCURRENT
CHANGE
Customer 01ABA->B
Customer 02BCB->C
Customer 03BAB->A

The only idea I have is to export the two pareto charts in .qvd format, and then re-import them again, building at script level a single new table.

But this requires macro development.

Could you suggest a different solution to this problem?

Thank you!

Emilio.

140 Views
Employee

It sounds as if you don't need a dynamic classification, i.e. you don't want the classification to change with your selections. If so, you could use the above logic, but in the script instead of in a chart. And then you could do it for Customers and per year.

Do you have some sample data? This could make a good blog post...

HIC

140 Views
Contributor

Very Nice!

HIC, do you have other examples that explains different kind of Analysis/Charts like this one? As a BI Developer, I always strive to choose the best visual for the available data.

Thank you,

140 Views