“Which products contribute to the first 80% of our turnover?”
This type of question is common in all types of business intelligence. I say “type of question” since it appears in many different forms: Sometimes it concerns products, but it can just as well concern customers, suppliers or sales people. It can really be any dimension. Further, here the question was about turnover, but it can just as well be number of support cases, or number of defect deliveries, etc. It can in principle be any additive measure.
It is called Pareto analysis. Sometimes also known as 80/20 analysis or ABC analysis.
The logic is that you first sort the products according to size, then accumulate the numbers, and finally calculate the accumulated measure as a percentage of the total. The products contributing to the first 80% are your best products; your “A” products. The next 10% are your “B” products, and the last 10% are your “C” products.
And here’s how you do it in QlikView:
Create a pivot table and choose your dimension and your basic measure. In my example, I use Product and Sum(Sales).
Sort the chart descending by using the measure Sum(Sales) as sort expression. It is not enough just to check “Sort by Y-value”.
Add a second expression to calculate the accumulated sales value: RangeSum(Above(Sum(Sales), 0, RowNo())) Call this expression Accumulated Sales. The Above() function will return an array of values – all above values in the chart – and the RangeSum() function will sum these numbers.
Create a third expression from the previous one; one that calculates the accumulated sales in percent: RangeSum(Above(Sum(Sales), 0, RowNo())) / Sum(totalSales) Format it as a percentage and call it Inclusive Percentage.
Create a fourth expression from the previous one; one that calculates the accumulated sales in percent, but this time excluding the current row: RangeSum(Above(Sum(Sales), 1, RowNo())) / Sum(totalSales) Format it as a percentage and call it Exclusive Percentage.
Create a fifth expression for the ABC classification: If([Exclusive Percentage] <= 0.8, 'A', If([Exclusive Percentage] <= 0.9, 'B', 'C')) Call this expression Pareto Class. The reason why the Exclusive Percentage is used, is that the classification should be determined by the lower bound of a product’s segment, not the upper.
Create a conditional background color, e.g. If([Pareto Class] = 'C', LightRed(), If([Pareto Class] = 'B', Yellow()))
You should now have a table similar to the following. In it you can clearly see the classification of different products.
In this table, there are five different expressions that you can use for Pareto analysis. The graph in the beginning of this post uses Sales and Inclusive Percentage for the bars and the line, respectively; and Pareto Class for the coloring of the bars.
Further, you may want to combine the Pareto Class and the Exclusive Percentage into one expression:
I seem to be having an issue with my Percentages. My Total and Accumulated columns are fine however the Inclusive and Exclusive percentages look way out..... :
As usual it was another informative article from you. I have a question, maybe i haven't really understood the concept correctly and will appreciate some insights from someone of your calibre.
When we talk of the Pareto Analysis or to say 80:20 rule we are trying to derive something like 20% of the Customers generate 80%(or maybe 75 - 90%) Sales. In the example above and most of the examples ive seen shows which Customers account for 80% of the Sales but it never takes into account the percentage those Customers account for ie it shows the 80% side of the concept but takes for granted the 20% part of the concept. Is this how this is supposed to be or am i missing something. Though it is likely to be the latter, i would appreciate your feedback/explanation on this.
To say that "20% of the Customers generate 80% of the Sales" is just a rule of thumb. In other words - a gross over-simplification. But it could also be a hypothesis that you want to test.
The above analysis gives you a tool to test this hypothesis. The method really answers the question "Which customers contribute to the first 80% of the sales?" The answer is different in different data sets. In your data, it could be 10% of the customers. Or it could be 40%. In any case, it is a tool that will help you understand your own data and help you prioritize your customers.
Let's say I already built 2 pareto charts, the first related to Previous Year and the second related to Current Year. I want to analyze now class status changes to build a new report listing all class changes occurred in the form: <Previous Year Class->Current Year Class> i.e. A->A, A->B, B->A, etc.
For example:
CUSTOMER
PREVIOUS
CURRENT
CHANGE
Customer 01
A
B
A->B
Customer 02
B
C
B->C
Customer 03
B
A
B->A
The only idea I have is to export the two pareto charts in .qvd format, and then re-import them again, building at script level a single new table.
But this requires macro development.
Could you suggest a different solution to this problem?
It sounds as if you don't need a dynamic classification, i.e. you don't want the classification to change with your selections. If so, you could use the above logic, but in the script instead of in a chart. And then you could do it for Customers and per year.
Do you have some sample data? This could make a good blog post...
HIC, do you have other examples that explains different kind of Analysis/Charts like this one? As a BI Developer, I always strive to choose the best visual for the available data.