Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have a large dataset of ratings, and I need to rank the latest ratings - 90% of the dataset can be ranked by MAX(RATINGID), but 10% of the dataset has a more recent RATINGDATE, where the corresponding RATINGID is not the MAX(RATINGID). For example, the latest ratings in the dataset below are the ones with a RATINGDATE of 06/09/2011 - however, the RATINGID for the ratings dated 06/09/2010 is greater, so this is what MAX(RATINGID) would return, which in this case, is incorrect - I want to rank the ratings dated 06/09/2011.
In PL/SQL, this could be achieved with a statement like this, and I would like the equivalent for this in QlikView:
RANK()OVER (PARTITION BY RTNG_TYP ORDER BY RTNG_DT DESC, RTNG_ID DESC)
I am currently ranking the data like this, but in the case of the above dataset, this code returns the ratings for 06/09/2010, which is incorrect, although this code works for 90% of the entire dataset:
LOAD 1 as CF_RNK,
max(RATINGID) as RATINGID,
RATINGTYPEDESCRIPTION
RESIDENT CF_RATINGS_SLICE
GROUP BY ISSUERID,RATINGTYPEDESCRIPTION;
So how can I achieve a solution for both scenarios in QlikView? I need a way to determine which is the correct RATINGID to rank - the MAX(RATINGID) or the RATINGID with the MAX(RATINGDATE)?
I would also need to retain the grouping by ISSUERID and RATINGTYPEDESCRIPTION, so a solution that allows me to retain the use of GROUP BY would be ideal.
Many thanks,
VB
Hi VB,
I'm not sure if I have clearly understood. Let me make sure...
You wanted to Rank by RATINGID but you wanted to first rank where RATINGDATE = 06/09/2011
And then you wanted to Rank based on RATINGID, I'm assuming all this in Desc order. Is this right? Also, please can you post some sample data in QVW file or excel file showing the output rank numbers? It will be easy to work with something where we know the output.
Thanks,
DV
Thanks for your response DV - you have a pretty clear understanding of the problem.
For 90% of my dataset, the MAX(RATINGID) correctly returns the latest rating; however, for the other 10%, the MAX(RATINGID) does not represent the latest rating - so, to cover both cases, the MAX(RATINGDATE) should be used first, followed by the MAX(RATINGID) - below is the load script code I am currently using to define a rank within the dataset - the ability to return the MAX(RATINGDATE) followed by the MAX(RATINGID) within a query such as the one below would be ideal:
CFRATINGSRANK:
LEFT JOIN (CF_RATINGS_SLICE)
LOAD 1 as CF_RNK,
max(RATINGID) as RATINGID,
RATINGTYPEDESCRIPTION
RESIDENT CF_RATINGS_SLICE
BY ISSUERID,RATINGTYPEDESCRIPTION
ORDER BY RATINGDATE desc,RATINGID desc;
Below is the dataset that is returned for one of the issuers in question - in this case, the data is incorrect, because there are more recent ratings than 06/09/2010:
Below is the data for the same ISSUERID within the Oracle database, showing that the latest ratings are actually dated '09/06/2011' - however, the RTNG_ID values for these ratings are lower than the RTNG_ID values for the ratings dated '09/06/2010', so this is why in QlikView, MAX(RATINGID) returns the ratings dated '09/06/2010':
Oracle allows you to define a rank by partitioning the data by a series of fields, e.g.:
RANK() OVER (PARTITION BY grp_id ORDER BY ratingdate DESC, rtng_id DESC) rank1
I would like the equivalent of the Oracle RANK() statement in QlikView.
Many thanks,
VB
You can use
If(Previous(Group) = Group, Peek('Rank') +1, 1) As Rank
But you need to sort the value by the fields first. Works with more than 1 field in the window
Data:
Load
*,
1 As Rank
inline
[Id, Group, Date
1, A, 01/01/2010
2, A, 02/01/2010
3, A, 04/01/2010
4, A, 10/01/2010
5, B, 01/01/2009
6, B, 01/02/2010
7, B, 01/03/2011
8, C, 01/01/2010
9, C, 01/02/2010
10, D, 06/06/2010
11, D, 02/02/2010
12, D, 05/05/2010
13, D, 03/03/2010
14, D, 04/04/2010];
Fact:
Noconcatenate Load
Id,
Group,
Date,
If(Previous(Group) = Group, Peek('Rank') +1, 1) As Rank
resident Data
order by Group, Date;
Drop table Data;
Thanks for the neat solution . I tried this
CLASSTEST:
LOAD * INLINE
[ Class , Subject , StudentID, Marks
1 , 1, 1, 90
1 , 1, 2, 80
1 , 1 , 3, 85
1, 2 , 1 , 77
1, 2, 2 , 88
1, 2, 3 , 66
1 , 3 , 1 , 55
1 , 3 , 2 , 77
1 , 3 , 3 , 88
];
TEMPCLASSTESTSORTED:
NoConcatenate LOAD
* , 1
as Subrank
RESIDENT CLASSTEST
ORDER BY Class, Subject , Marks desc;
drop
table
CLASSTEST;
FACTCLASSTEST:
NoConcatenate LOAD Class
, Subject
, StudentID
, Marks
, if (Previous(Subject) = Subject , Peek('Subrank')+1 , 1) as subjectRank
resident TEMPCLASSTESTSORTED;
drop table TEMPCLASSTESTSORTED;
This only brings up Rank 1 and subsequent ones fail ? Any ideas ?
Hi,
Your sentence it is not correct.
if (Previous(Subject) = Subject , Peek('Subrank subjectRank')+1 , 1) as subjectRank
Kind regards!