
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
How to do count distinct in script
Hi Team,
I am a newbie in qlikview, so not much aware of scripting principals.
I have following data :
Year Month Customer OrderID DistinctCustomerCount
2013 Jan A 1 1
2013 Jan A 2 0
2013 Jan A 3 0
2013 Jun B 4 1
Out of this DistinctCustomerCount isto be calculated at load time. Logic would be to look at year month combination and pick unique customer only on first occurance. So in given situation DistinctCustomerCount = 1 for first record and last record.
I know that this can be done at run time by count ( Distinct (customer) ) function, but what i want to know is
1 ) how can this functionality be done in script
2) Is this possiblw with Load statement?


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
HI Rahul
you can use distinct function and count function with group by clause like
Load Distinct
x,
y,
count(x) as number
from abc group by x,y;
Regards
Kushal Thakral

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Rahul,
Is there a particular reason you can't do this at runtime? count(distinct Customer) will give you the number of Customers based on the current selection but remember that using simple Set Analysis count({1} distinct Customer) will give you the number of Customers across the total document irrespective of current selections.
Normally a count is flagged as 1/0 in the script to allow a simple SUM(Flag) to return the required count value on large datasets.
If you want to create a summary table the method suggested by Kushal will allow you to do this. Include the relevant dimensions you want to aggregate over.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello Kushal / Steeve,
Thank You for your prompt reply.
Steve,
I want to use count as flag ( 1/0) to allow simple SUM(Flag) as you said, thats what i want to do.
Can you please explain how can i do it with this data set?
Or should i use method Kushal has wrote?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I am not sure if this is what you are looking.
LOAD Year,
Month,
Customer,
1 as DistinctCustomerCount,
Min(OrderID) as OrderID
Group by Year, Month, Customer;
LOAD * INLINE [
Year,Month,Customer,OrderID
2013,Jan,A,1
2013,Jan,A,2
2013,Jan,A,3
2013,Jun,B,4
];

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
If I understand correctly you're trying to calculate the number of distinct customers within each month.
LOAD Month, count(distinct Customer) as #DistinctCustomerInMonth
resident OrderID
group by Month;

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Rahul,
i would not recommend your approach.
Imagine your row with OrderID 2 takes place in Feb instead of Jan.
Since your DistinctCustomerCount only applies to the very first row the customer will not be counted if data is reduced to Feb, although there actually was an order in this period. Although tecnically working, your reports will deliver misleading information!
I can imagine two solutions:
a) use count(distinct Customer) within your expression
b) split your table apart
- a dimension table "CustomerDim" holding all customer attribues (one row per customer) plus "DistinctCustomerCount" on each row
- a dimension table "OrderDim" holding all order attributes (one row per order) plus "OrderCount" (with integrated order date dimension by adding "OrderDateYear", "OrderDateMonth" and "OrderDate" fields on each row)
- a fact table "OrderFact" holding only "CustomerKey" and "OrderKey" (wether you hold your Order metrics within this table or your order dimension is up to you)
cheers
