Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
I have a data set where I have to find rank of rows based on many fields and then based on the rank I have to sum the data.
So, I am using rowno and autonumber to find the rank. The problem is since I have many fields in the row, when I work with sample data it is fine but with actual set it is very very slow and undoable. Please advice.
Attaching sample.
My script:
data:
LOAD Market,
[Planning Article],
Article,
Period,
W_Period,
WS_Qty
FROM
[..\..\..\Users\user\Desktop\data.xlsx]
(ooxml, embedded labels, table is sample.xls);
Rank_Temp1:
LOAD RowNo() as Rowno,Market, Period,W_Period,
[Planning Article],Article
Market&Period&W_Period&[Planning Article]&Article as RankKey,
Sum(WS_Qty) as Sum_WS_Qty
Resident data
Group By
Market,Period,W_Period,
[Planning Article],Article
Order By W_Period Desc;
drop table data;
Rank_Temp2:
LOAD *,
AutoNumber(Rowno,RankKey) as Rank
Resident Rank_Temp1;
DROP Table Rank_Temp1;
Regards,
Ruchi
Do it in a single pass like this:
data:
LOAD
RowNo() as Rowno,
Market,
Period,
W_Period,
[Planning Article],
Article,
Sum(WS_Qty) as Sum_WS_Qty,
AutoNumber(RowNo(), RankKey) as Rank
Group By Market,
Period,
W_Period,
[Planning Article],
Article,
RankKey
;
LOAD Market,
[Planning Article],
Article,
Period,
W_Period,
WS_Qty,
Market & Period & W_Period & [Planning Article] & Article as RankKey
FROM [..\..\..\Users\user\Desktop\data.xlsx]
(ooxml, embedded labels, table is sample.xls);
Hi Jonanthan,
Thank you for the response. I tried this but again same problem. Its taking just too much time so that I have to abort the reload, before it finishes.
I tried Autonumberhash128, but that doesnt give desired result. Can you please suggest some alternative to autonumber here?
Regards,
Ruchi
Aggregation-loads with group by are always quite slow compared with "normal" loadings. Therefore you should check if you such load really needs or if it's not possible to aggregate this within the gui. Especially if you have many fields within the group by which indicates that the compression-rate of records is rather small - maybe you don't need all fields within the group by then it could be a benefit to split these within a separate load within a particular table or to merge the tables again.
Another point is your order by within the load - most often it's faster to split load-statements in this order:
1. load with where-clause
2. load with order by
3. load with group by
instead of combining these within a single-load-statement.
If this didn't speed up the load how do you it needed you will need to use an incremental approach. Here you will find various links to them and some other helpful topics, too: Advanced topics for creating a qlik datamodel.
- Marcus
Hi Marcus,
I first did it in gui using rank function but due to complex nature of reports, the performance of highly degraded. I have tried to break down the load statement in many bits, save as QVDs asn then use it as well. But no good.Thanks for the link!
One question, will here FirstSortedvalue work as an alternative to autonumber for ranking?
Regards,
Ruchi
Hi Ruchi,
I don't think that firstsortedvalue() will help or change it significantly then it's also an aggregation-function. Beside them I could imagine that there are further possibilities within the datamodel to improve the performance and to make these calculations within the gui because you are loading the data from an single sheet in excel which meant that the amount of data is rather small - max. number of records about 1 M.
- Marcus
Hi Marcus,
Yes trying to explore changes in the data model. Thank you.
Regards,
Ruchi