Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Finding nodal median in QlikView

I have a query regarding finding median value in a hierarchical dimension in QlikView:

I have a following hierarchical dimension in QlikView. (Dimension Name : VendorSubLocation, Vendor is the parent and Location is the child).

Assume the following values (Dimension names and the expression value):

                                      Rate

IBM                                  

     USA                            12

     UK                              13

     Australia                      14

ATK

     USA                             13

     UK                               12

     SA                               16

Corresponding to each Level 1 of VendorSubLocation dimension, I want to find out median value. That is the output should be

IBM - 13

and ATK - 13

Can you please suggest a method how this can be implemented in the load script?

4 Replies
Gysbert_Wassenaar

Use the median function. You'll need a group by clause if you want to calculate this in the script.

load Vendor, median(Rate)

from ...somewhere...

group by Vendor;


talk is cheap, supply exceeds demand
Not applicable
Author

Hi Gysbert,

There is an additional complication. Corresponding to each vendors there are multiple transactions. So consider the following situation:

                                      Rate          # of transactions

IBM                                 

     USA                            12                    10

     UK                              13                     20

     Australia                      14                    61

Your solution will return 14 as the median value since Rate corresponding to  45th/46th transaction is 14. However, I want the median value as 13.

Gysbert_Wassenaar

Then first do a load distinct Vendor, Rate ...from... etc so you get the distinct Rate values


talk is cheap, supply exceeds demand
Not applicable
Author

That still does not seem to solve my problem. I think I should rephrase my issue and mention the data fields present in the fact so that it becomes simpler.

The input file has following format:

Company      Geography            Rate             Hours

IBM                    USA                 11                  1

IBM                    USA                 12                  1

IBM                    USA                 13                  1

I calculate the rate for "Company-Geography" combination at load time by grouping by "Company-Geography" combination. ( I have a "Company-Geography" concatenated field in the input file). Lets call this effective rate.

What I want to do is for a given company, I want to find the geography corresponding to median rates. There are multiple transactions for "Company-Geography" combination and effective rate can be same for any two "Company-Geography" combination (which is one of the reasons why i am concerned about using Distinct).