Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik and ServiceNow Partner to Bring Trusted Enterprise Context into AI-Powered Workflows. Learn More!
hic
Former Employee
Former Employee

“Which products contribute to the first 80% of our turnover?”

This type of question is common in all types of business intelligence. I say “type of question” since it appears in many different forms: Sometimes it concerns products, but it can just as well concern any dimension, e.g. customer, supplier, sales person, etc. Further, here the question was about turnover, but it can just as well be e.g. number of support cases, or number of defect deliveries, etc.

 

QV Bar chart.png

 

It is called Pareto analysis or ABC analysis and I have already written a blog post on this topic. However, in the previous post I only explained how to create a measure which showed the Pareto class. I never showed how to create a dimension based on a Pareto classification – simply because it wasn’t possible.

 

But now it is.

 

But first things first. The logic for a Pareto analysis is that you first sort the products according to their sales numbers, then accumulate the numbers, and finally calculate the accumulated measure as a percentage of the total. The products contributing to the first 80% are your best, your “A” products. The next 10% are your “B” products, and the last 10% are your “C” products. In the above graph, these classes are shown as colors on the bars.

 

The previous post shows how this can be done in a chart measure using the Above() function. However, if you use the same logic, but instead inside a sorted Aggr() function, you can achieve the same thing without relying on the chart sort order. The sorted Aggr() function is a fairly recent innovation, and you can read more about it here.

 

The sorting is needed to calculate the proper accumulated percentages, which will give you the Pareto classes. So if you want to classify your products, the new expression to use is

 

=Aggr(
    If(Rangesum(Above(Sum({1} Sales)/Sum({1} total Sales),1,RowNo()))<0.8, 'A',
        If(Rangesum(Above(Sum({1} Sales)/Sum({1} total Sales),1,RowNo()))<0.9, 'B',
            'C')),
    (Product,(=Sum({1} Sales),Desc))
    )

 

The first parameter of the Aggr() – the nested If()-functions – is in principle the same as the measure in the previous post. Look there for an explanation.

 

The second parameter of the Aggr(), the inner dimension, contains the magic of the sorted Aggr():

 

    (Product,(=Sum({1} Sales),Desc))

 

This structured parameter specifies that the field Product should be used as dimension, and its values should be sorted descending according to Sum({1} Sales). Note the equals sign. This is necessary if you want to sort by expression.

 

So the Products inside the Aggr() will be sorted descending, and for each Product the accumulated relative sales in percent will be calculated, which in turn is used to determine the Pareto classes.

 

The set analysis {1} is necessary if you want the classification to be independent of the made selection. Without it, the classification will change every time the selection changes. But perhaps a better alternative is to use {$<Product= >}. Then a selection in Product (or in the Pareto class itself) will not affect the classification, but all other selections will.

 

The expression can be used either as dimension in a chart, or in a list box. Below I have used the Pareto class as first dimension in a pivot table.

 

QS Pivot.png

 

If you use this expression in a list box, you can directly select the Pareto class you want to look at.

 

QS List box.png

 

The other measures in the pivot table are the exclusive and inclusive accumulated relative sales, respectively. I.e. the lower and upper bounds of the product sales share:

 

Exclusive accumulated relative sales (lower bound):

 

=Min(Aggr(
    Rangesum(Above(Sum({1} Sales)/Sum({1} total Sales),1,RowNo())),
    (Product,(=Sum({1} Sales),Desc))
  ))

 

Inclusive accumulated relative sales (upper bound):

 

=Max(Aggr(
    Rangesum(Above(Sum({1} Sales)/Sum({1} total Sales),0,RowNo())),
    (Product,(=Sum({1} Sales),Desc))
  ))

 

Good luck in creating your Pareto dimension!

 

HIC

 

Further reading related to this topic:

The sortable Aggr function is finally here!

Recipe for a Pareto Analysis

Recipe for an ABC Analysis

86 Comments
marcus_sommer

I could imagine that you need more as single aggr() and also don't forget to check how the behaviour is changing by adding the TOTAL to the above() and a NODISTINCT to the aggr(). Another starting point for such approaches may be the suggestion from Sunny:

Solved: Sorted Aggr function - Qlik Community - 1522082 

0 Likes
483 Views
SDT
Creator
Creator

Thank you all for the suggestions.

Still not working although I have not yet created a new field for FiscalYear_FlexID. I was hoping to avoid that so I could report by year, quarter or month. I suppose I can create all three in the script.

I will report back if that solves things.

0 Likes
463 Views
SDT
Creator
Creator

OK, tried creating a field to combine FY & FLXID and used the expression below. Still no joy. Tried it with all combinations of the recommended NODISTINCT and TOTAL modifiers suggested above.

Seems like this should not be so diffiult.

{<BridgeDateType={'Invoiced'}, OrderState={'Invoiced'}, InvoicedAmt={">0"}>}
COUNT(DISTINCT
AGGR(
If(RANGESUM(ABOVE(
SUM({<FiscalYear_FlxID=>} InvoicedAmt)/SUM({<FiscalYear_FlxID=>} TOTAL InvoicedAmt)
,1,RowNo()))<0.8, FiscalYear_FlxID),
(FiscalYear_FlxID,(=SUM({<FiscalYear_FlxID=>} InvoicedAmt),desc))))

0 Likes
454 Views
marcus_sommer

I would like to come back to an above suggestion to simplify the approach as much as possible. For this remove all set analysis because the right set analysis might need to be included within each aggregation part. This means as scheme:

count({ SET } aggr({ SET } sum( { SET } Field), DIM))

and especially the combination of an outer and an inner SET may increase the complexity.

Further you may need more as a single aggr() and I think that the if-condition should be rather applied against the aggr() and not within it.

0 Likes
445 Views
SDT
Creator
Creator

OK, 

Tried it this way and the result was far larger than it should be. Well over 1,000 when the correct number (pulled from a straight table in Excel) is 287. In addition, it still changes when I filter for more than one fiscal year.

I don't follow what is meant by "you may need more as a single aggr()".

COUNT(DISTINCT
IF(AGGR(
    RANGESUM(ABOVE(
        SUM(InvoicedAmt)/SUM(TOTAL <FiscalYear> InvoicedAmt)
        ,1,RowNo()))
    ,(FLXID,(=SUM(InvoicedAmt),desc)))>0.8,FLXID)
)

 

0 Likes
435 Views
marcus_sommer

Meant was that such an expression may need a logic like:

count(aggr(rangesum(above(sum({< X = {aggr()}>} Field))), (DIM, aggr(sum(), DIM)))

and maybe even applied in further nested ways. Because of the n possible versions to combine n aggr() with n dimensions and extra NODISTINCT/TOTAL statements it's important not to play too long with trial & error (maybe one or two hours) and then starting to develop it systematically - with small steps and comprehending by each one why an unwanted result had happens - to improve it in a next attempt or to discard this approach. 

In your approach I notice that there is no aggr() in the sorting part, only one dimension is specified in the aggr() and above is starting by 1 instead of 0.

0 Likes
407 Views