
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Enhance the performance of grouping with count distinct in the Qlik Sense Script
Hello,
I am trying to use group by with count distinct inside my script.
below is a part of the script where I will have about 26 millions records to do a group by and get only few records instead or row data so I will have a small dashboard instead of a big size dashboard with low performance.
What I am facing is the slowness of the group by with the count distinct where I tried to use the order by (https://community.qlik.com/t5/QlikView-Documents/Optimize-Group-By-Performance/ta-p/1481470)
to enhance the performance but still it is taking for each section about 2 minutes to finish and I have about 25 sections where I have to do the same way with different dimensions
QTDGroupingTableACGlobal:
NoConcatenate
Load
Year,
IsQTD,
IsYTD,
[KPI Name],
[Player Days SK],
[Customer SK]
Resident GroupingTableAC_Temp
where IsQTD = 1
and IsYTD = 1
order by Year,
IsQTD,
IsYTD,
[KPI Name];
//Global
GroupingTable:
Load Year, IsQTD,IsYTD,[KPI Name],
'QTD' as [Period],
'Total' as [Grouping],
count(DISTINCT [Player Days SK]) as [Player Days Count],
count(DISTINCT [Customer SK]) as [Customer Count]
resident QTDGroupingTableACGlobal
group by Year,IsQTD,IsYTD,[KPI Name];
Any Idea what could be the best practice to enhance the performance of the load so my task will not take more than 10-20 min max.
Thanks.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello @EISSA ,
I tested your case with random data. I have setup 26 mln rows and tried to optimize your loading time. And I did manage to load the data in 44 seconds instead of 2 minutes. To achieve this result you will have to need the Source as a QVD file and you will have to add two fields to the source when writing to disk.
Because I have used some mockup data your results can be different. I am curious if you can accomplish the same results with the following script.
Regards Eddie
// Preparation
// ----------
// Write your source to a QVD
// Add two fieldnames:
// 1. Year&':'&IsQTD&':'&IsYTD&':'&[KPI Name]&':'&[Player Days SK]&':'&[Customer SK] as Key
// 2. IsQTD&IsYTD as ExistsField
// Loading the 'existsfield' because we want to have an optimized QVD load in next step
Exists_TMP:
LOAD * INLINE [ExistsField
11
];
// Loading from source optimized
QTDGroupingTableACGlobal:
LOAD *
FROM [lib://Temp QVD/Temp.qvd]
(qvd)
Where Exists(ExistsField)
;
// drop the temporary table
drop table Exists_TMP;
// Table with preceding loads, read from bottom to top
GroupingTable:
NoConcatenate
// Again we extract the values and loading all fields necessary
LOAD subfield(Index,':',1) as Year
, subfield(Index,':',2) as IsQTD
, subfield(Index,':',3) as IsYTD
, subfield(Index,':',4) as [KPI Name]
, 'QTD' as [Period]
, 'Total' as [Grouping]
, [Player Days Count]
, [Customer Count]
;
// Count the measures you want (notice, no distinct because we already retrieved the distinct values)
LOAD Index
, count([Player Days SK]) as [Player Days Count]
, count([Customer SK]) as [Customer Count]
group by Index
;
// Making a new key
Load Year&':'&IsQTD&':'&IsYTD&':'&[KPI Name] as Index
, *
;
// Extract fieldvalues with subfield from Key
LOAD subfield(Key,':',1) as Year
, subfield(Key,':',2) as IsQTD
, subfield(Key,':',3) as IsYTD
, subfield(Key,':',4) as [KPI Name]
, 'QTD' as [Period]
, 'Total' as [Grouping]
, subfield(Key,':',5) as [Player Days SK]
, subfield(Key,':',6) as [Customer SK]
;
;
// Loading all unique values from Key
LOAD FieldValue('Key', recno()) as Key
AUTOGENERATE FieldValueCount('Key');
DROP TABLE QTDGroupingTableACGlobal
;
