Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
When loading the data I need to keep only the latest record.
I don't have a date to rely on to determine the latest record but I have a status, based on which I can sort in a somewhat chronological order.
I read that Order By can only be used on Resident Loads and that removing duplicates was achieved by joining the data with itself.
The sorting by Status works fine.
but it seems I can't join a resident load with itself
[Raw]:
LOAD
...
[STATUS],
If([STATUS]='Canceled', '1', If([STATUS]='Closed', '2', If([STATUS]='Transmited', '3',))) as 'Status ID',
...
FROM [....]
[Ordered]:
Load *
Resident [Raw] Order By [Status ID];
Join ([Ordered])
[Unique]
Load *
Resident [Ordered]
Group By [Status ID];
I assume you have an ID field that identifies each "thing" (case, order whatever) that you want to keep. Maybe you are looking for something like this:
Raw:
LOAD *,
Match(Status,'Cancelled','Closed','Transmitted') as StatusID
INLINE [
ID, Status, Foo
1, Cancelled, A1
1, Closed, A2
2, Transmitted, Z1
3, Closed, Y1
3, Transmitted, Y2
];
INNER JOIN (Raw)
LOAD
ID,
min(StatusID) as StatusID
Resident Raw
Group by ID;
-Rob
I assume you have an ID field that identifies each "thing" (case, order whatever) that you want to keep. Maybe you are looking for something like this:
Raw:
LOAD *,
Match(Status,'Cancelled','Closed','Transmitted') as StatusID
INLINE [
ID, Status, Foo
1, Cancelled, A1
1, Closed, A2
2, Transmitted, Z1
3, Closed, Y1
3, Transmitted, Y2
];
INNER JOIN (Raw)
LOAD
ID,
min(StatusID) as StatusID
Resident Raw
Group by ID;
-Rob
Thanks Rob!
I like the min(StatusID) trick, pretty easier than my sorting then filtering...
I like the use of match() to associate numbers/id to strings too. In my case I actually need a default case but because match() returns 0 when no match is found it still works fine, I just reverted the order of the various statuses, and used max() instead of min().
Great tricks, thanks for your help