Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all,
It is said that using Sum() instead of count(Distinct ) the performance will increase. But I have duplicate Key values in my data.
So how can I use it. Can anyone help me on this?
Regards,
Anjali Gupta
As others have said; today you can use Count(Distinct) without a performance issue.
If you really want to use SUM instead of count(Distinct) you can do the following:
Fact:
load %Keytobecounted, Facts, numbers, etc from fact.qvd (qvd);
CountingTable:
load distinct %Keytobecounted, 1 as Keycounter resident Fact;
This creates a table with the unique keys and a field you can sum that will equal a count of keys.
now you can use the expression Sum(Keycounter) instead of Count(Distinct %Keytobecounted).
Hi Anjali,
can you please provide more clarity on your question. Distinct obviously cost the query performance.
Sreeni
did you read this?
Those days are gone. Now QlikView uses multi-threading, and there is no such performance issue with distinct clause. Read: A Myth About Count(distinct …)
Maxgro, I don't want to use count(Distinct) instead I want to use Sum (). How can I use that?
Hi,
Richard Pearce of Qlikcentral also has a blog that might useful to read.
http://qlikcentral.com/2015/08/04/count-or-sum1-which-is-best/
You can use a record counter at script level and then use sum(recordcounter)...
Hi Balraj, How to use a record counter? Can you explain.
Please provide sample data
Hi Anjali,
in you load script add an extra line of code like this:
1 AS RecordCounter
Then in the frontend you can do a Sum(RecordCounter) to count number of rows. Please beware, if for example you have data with say Orders with multiple Orderlines and you want to count unique orders then this won't work. Instead if you want to count orderlines it wil work. Otherwise you need to split the tabels (Order table and Orderline table).
Hope you get the picture.
Greetz,
Marc