Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Most recent Data - Duplicates

Hi,

I have loaded some data (around 800.000 x 4) in my application Qlik but I have some duplicates for each person (example : Address_1 with validity from 01/01/14 - 31/12/14, Address_2 with validity from 01/01/15 - 31/12/99). In other to have unique data, I would like to select only the most recent data for each ID person. How I can do it ?

Best.

Brahim.

1 Solution

Accepted Solutions
fvelascog72
Partner - Specialist
Partner - Specialist

Hi,

You don´t have to group by all the fields.

Try this:

Adresses:

LOAD

[Business Partner Person],
[Adresses.Address Type],
Adresses.Source,
Adresses.VF,

Adresses.VT,
[Adresses.Conflict Code],
[Adresses.Location Code],
[Adresses.Number From],
[Adresses.Number To],
[Adresses.PO Box],
Adresses.Validity_01_01_2015
FROM Adresses.qvd (qvd);


Right Join(Adresses)

LOAD

[Business Partner Person],

Max(Adresses.VT) as MaxDate_VT_Adresses

Resident Adresses

Group by [Business Partner Person];

View solution in original post

6 Replies
fvelascog72
Partner - Specialist
Partner - Specialist

Hi,

Try this:

Data:

LOAD

*

Inline [

Person_id, Adress, Validity_Start, Validity_End

1, Adress_1, 01/01/2014, 31/12/2014

1, Adress_2, 01/01/2015, 31/12/9999

];

Right Join(Data)

LOAD

  Person_id,

  LastValue(Validity_Start) as "Validity_Start"

Resident Data

Group by Person_id

Order by Validity_Start asc;

Not applicable
Author

Hi Brahim,

You can always use aggregation, in this case MAX(StartDate) for the address.

I will give you an example using Excel in a moment.

Hope helps,

David Sugito

Mobile: + 62 878 0888 9871
Phone: + 62 21 569 823 85 / 86
Email: me@davidshuang.com
Site: davidshuang.com

Not applicable
Author

Whoops Federico,

Great, now I know that QlikView has it!

Never used LastValue() before!

Thanks for sharing!

And btw, I've attached your code too.


David Sugito

Mobile: + 62 878 0888 9871
Phone: + 62 21 569 823 85 / 86
Email: me@davidshuang.com
Site: davidshuang.com

Not applicable
Author

Hello Frederico,

Thanks a lot for your answer...but what if I want to keep all data fields (as in the code) and keep only the max of the field date for each person id ?

I have tried a group by but this only work when I have this case:

Work Fine

Adresses:

LOAD [Business Partner Person],

Max(Adresses.VT) as MaxDate_VT_Adresses,

FROM
Adresses.qvd
(
qvd)

Group By [Business Partner Person]

Doesn't work

Adresses:

LOAD[Business Partner Person],
[Adresses.Address Type],
Adresses.Source,
Adresses.VF,

Max(Adresses.VT) as MaxDate_VT_Adresses,
[Adresses.Conflict Code],
[Adresses.Location Code],
[Adresses.Number From],
[Adresses.Number To],
[Adresses.PO Box],
Adresses.Validity_01_01_2015
FROM
Adresses.qvd
(
qvd)

GroupBy[Business Partner Person],[Adresses.Address Type],
Adresses.Source,
Adresses.VF,
[Adresses.Conflict Code],
[Adresses.Location Code],
[Adresses.Number From],
[Adresses.Number To],
[Adresses.PO Box],
Adresses.Validity_01_01_2015
;

Best Regards

Brahim.

fvelascog72
Partner - Specialist
Partner - Specialist

Hi,

You don´t have to group by all the fields.

Try this:

Adresses:

LOAD

[Business Partner Person],
[Adresses.Address Type],
Adresses.Source,
Adresses.VF,

Adresses.VT,
[Adresses.Conflict Code],
[Adresses.Location Code],
[Adresses.Number From],
[Adresses.Number To],
[Adresses.PO Box],
Adresses.Validity_01_01_2015
FROM Adresses.qvd (qvd);


Right Join(Adresses)

LOAD

[Business Partner Person],

Max(Adresses.VT) as MaxDate_VT_Adresses

Resident Adresses

Group by [Business Partner Person];

Not applicable
Author

Frederico,

Excellent use of the Join....it works...Thanks a lot and have a nice day.

Brahim.