Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have loaded some data (around 800.000 x 4) in my application Qlik but I have some duplicates for each person (example : Address_1 with validity from 01/01/14 - 31/12/14, Address_2 with validity from 01/01/15 - 31/12/99). In other to have unique data, I would like to select only the most recent data for each ID person. How I can do it ?
Best.
Brahim.
Hi,
You don´t have to group by all the fields.
Try this:
Adresses:
LOAD
[Business Partner Person],
[Adresses.Address Type],
Adresses.Source,
Adresses.VF,
Adresses.VT,
[Adresses.Conflict Code],
[Adresses.Location Code],
[Adresses.Number From],
[Adresses.Number To],
[Adresses.PO Box],
Adresses.Validity_01_01_2015
FROM Adresses.qvd (qvd);
Right Join(Adresses)
LOAD
[Business Partner Person],
Max(Adresses.VT) as MaxDate_VT_Adresses
Resident Adresses
Group by [Business Partner Person];
Hi,
Try this:
Data:
LOAD
*
Inline [
Person_id, Adress, Validity_Start, Validity_End
1, Adress_1, 01/01/2014, 31/12/2014
1, Adress_2, 01/01/2015, 31/12/9999
];
Right Join(Data)
LOAD
Person_id,
LastValue(Validity_Start) as "Validity_Start"
Resident Data
Group by Person_id
Order by Validity_Start asc;
Hi Brahim,
You can always use aggregation, in this case MAX(StartDate) for the address.
I will give you an example using Excel in a moment.
Hope helps,
David Sugito
Mobile: + 62 878 0888 9871
Phone: + 62 21 569 823 85 / 86
Email: me@davidshuang.com
Site: davidshuang.com
Whoops Federico,
Great, now I know that QlikView has it!
Never used LastValue() before!
Thanks for sharing!
And btw, I've attached your code too.
David Sugito
Mobile: + 62 878 0888 9871
Phone: + 62 21 569 823 85 / 86
Email: me@davidshuang.com
Site: davidshuang.com
Hello Frederico,
Thanks a lot for your answer...but what if I want to keep all data fields (as in the code) and keep only the max of the field date for each person id ?
I have tried a group by but this only work when I have this case:
Work Fine
Adresses:
LOAD [Business Partner Person],
Max(Adresses.VT) as MaxDate_VT_Adresses,
FROM
Adresses.qvd
(qvd)
Group By [Business Partner Person]
Doesn't work
Adresses:
LOAD[Business Partner Person],
[Adresses.Address Type],
Adresses.Source,
Adresses.VF,
Max(Adresses.VT) as MaxDate_VT_Adresses,
[Adresses.Conflict Code],
[Adresses.Location Code],
[Adresses.Number From],
[Adresses.Number To],
[Adresses.PO Box],
Adresses.Validity_01_01_2015
FROM
Adresses.qvd
(qvd)
GroupBy[Business Partner Person],[Adresses.Address Type],
Adresses.Source,
Adresses.VF,
[Adresses.Conflict Code],
[Adresses.Location Code],
[Adresses.Number From],
[Adresses.Number To],
[Adresses.PO Box],
Adresses.Validity_01_01_2015;
Best Regards
Brahim.
Hi,
You don´t have to group by all the fields.
Try this:
Adresses:
LOAD
[Business Partner Person],
[Adresses.Address Type],
Adresses.Source,
Adresses.VF,
Adresses.VT,
[Adresses.Conflict Code],
[Adresses.Location Code],
[Adresses.Number From],
[Adresses.Number To],
[Adresses.PO Box],
Adresses.Validity_01_01_2015
FROM Adresses.qvd (qvd);
Right Join(Adresses)
LOAD
[Business Partner Person],
Max(Adresses.VT) as MaxDate_VT_Adresses
Resident Adresses
Group by [Business Partner Person];
Frederico,
Excellent use of the Join....it works...Thanks a lot and have a nice day.
Brahim.