Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

sorting then removing duplicates

When loading the data I need to keep only the latest record.

I don't have a date to rely on to determine the latest record but I have a status, based on which I can sort in a somewhat chronological order.

I read that Order By can only be used on Resident Loads and that removing duplicates was achieved by joining the data with itself.

The sorting by Status works fine.

but it seems I can't join a resident load with itself

[Raw]:

LOAD

...

  [STATUS],

If([STATUS]='Canceled', '1',  If([STATUS]='Closed', '2', If([STATUS]='Transmited', '3',))) as 'Status ID',

...

FROM [....]

[Ordered]:

Load  *

Resident [Raw] Order By [Status ID];

Join ([Ordered])

[Unique]

Load *

Resident [Ordered]

Group By [Status ID];

1 Solution

Accepted Solutions
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

I assume you have an ID field that identifies each "thing" (case, order whatever) that you want to keep. Maybe you are looking for something like this:

Raw:

LOAD *,

  Match(Status,'Cancelled','Closed','Transmitted')  as StatusID

INLINE [

ID, Status, Foo

1, Cancelled, A1

1, Closed, A2

2, Transmitted, Z1

3, Closed, Y1

3, Transmitted, Y2

];

INNER JOIN (Raw)

LOAD

  ID,

  min(StatusID) as StatusID

Resident Raw

Group by ID;

-Rob

http://masterssummit.com

http://qlikviewcookbook.com

View solution in original post

2 Replies
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

I assume you have an ID field that identifies each "thing" (case, order whatever) that you want to keep. Maybe you are looking for something like this:

Raw:

LOAD *,

  Match(Status,'Cancelled','Closed','Transmitted')  as StatusID

INLINE [

ID, Status, Foo

1, Cancelled, A1

1, Closed, A2

2, Transmitted, Z1

3, Closed, Y1

3, Transmitted, Y2

];

INNER JOIN (Raw)

LOAD

  ID,

  min(StatusID) as StatusID

Resident Raw

Group by ID;

-Rob

http://masterssummit.com

http://qlikviewcookbook.com

Not applicable
Author

Thanks Rob!

 

I like the min(StatusID) trick, pretty easier than my sorting then filtering...

 

 

I like the use of match() to associate numbers/id to strings too. In my case I actually need a default case but because match() returns 0 when no match is found it still works fine,  I just reverted the order of the various statuses, and used max() instead of min().

 

 

Great tricks, thanks for your help