Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

How to do count distinct in script


Hi Team,

I am a newbie in qlikview, so not much aware of scripting principals.

I have following data :

Year          Month          Customer          OrderID          DistinctCustomerCount

2013          Jan                    A                    1                      1

2013          Jan                    A                    2                      0

2013          Jan                    A                    3                      0

2013          Jun                    B                    4                      1

Out of this DistinctCustomerCount isto be calculated at load time. Logic would be to look at year month combination and pick unique customer only on first occurance. So in given situation DistinctCustomerCount = 1 for first record and last record.

I know that this can be done at run time by count ( Distinct (customer) ) function, but what i want to know is

1 ) how can this functionality be done in script

2) Is this possiblw with Load statement?

Capture.JPG.jpg

Capture.JPG.jpg

6 Replies
kushalthakral
Creator III

HI Rahul

you can use distinct function and count function with group by clause like

Load Distinct

x,

y,

count(x) as number

from abc group by x,y;

Regards

Kushal Thakral

Anonymous
Not applicable
Author

Hi Rahul,

Is there a particular reason you can't do this at runtime? count(distinct Customer) will give you the number of Customers based on the current selection but remember that using simple Set Analysis count({1} distinct Customer) will give you the number of Customers across the total document irrespective of current selections.

Normally a count is flagged as 1/0 in the script to allow a simple SUM(Flag) to return the required count value on large datasets.

If you want to create a summary table the method suggested by Kushal will allow you to do this. Include the relevant dimensions you want to aggregate over.

Not applicable
Author

Hello Kushal / Steeve,

Thank You for your prompt reply.

Steve,

I want to use count as flag ( 1/0) to allow simple SUM(Flag) as you said, thats what i want to do.

Can you please explain how can i do it with this data set?

Or should i use method Kushal has wrote?

Anonymous
Not applicable
Author

Hi,

I am not sure if this is what you are looking.

LOAD Year,

  Month,

  Customer,

  1 as DistinctCustomerCount,

  Min(OrderID) as OrderID

Group by Year, Month, Customer;

LOAD * INLINE [

    Year,Month,Customer,OrderID

    2013,Jan,A,1

    2013,Jan,A,2

    2013,Jan,A,3

    2013,Jun,B,4

];

Anonymous
Not applicable
Author

If I understand correctly you're trying to calculate the number of distinct customers within each month.

LOAD Month, count(distinct Customer) as #DistinctCustomerInMonth

resident OrderID

group by Month;

gerhard_jakubec
Contributor III

Hi Rahul,

i would not recommend your approach.

Imagine your row with OrderID 2 takes place in Feb instead of Jan.

Since your DistinctCustomerCount only applies to the very first row the customer will not be counted if data is reduced to Feb, although there actually was an order in this period. Although tecnically working, your reports will deliver misleading information!

I can imagine two solutions:

a) use count(distinct Customer) within your expression

b) split your table apart

  • a dimension table "CustomerDim" holding all customer attribues (one row per customer) plus "DistinctCustomerCount" on each row
  • a dimension table "OrderDim" holding all order attributes (one row per order) plus "OrderCount" (with integrated order date dimension by adding "OrderDateYear", "OrderDateMonth" and "OrderDate" fields on each row)
  • a fact table "OrderFact" holding only "CustomerKey" and "OrderKey" (wether you hold your Order metrics within this table or your order dimension is up to you)

cheers