Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have the following data (table):
[activities]:
Load * Inline
[
id, task_id, activity,
1, 1, "run"
2, 1, "sleep"
3, 2, "eat"
4, 1, "eat"
5, 2, "sleep",
6, 3, "run"
]
;
I want to keep only the most recent records for each task_id (and the ones with the largest id are the most recent). So the result should look like this:
Can you help us with this? We tried the following solution:
[activities]:
Load * Inline
[
id, task_id, activity,
1, 1, "run"
2, 1, "sleep"
3, 2, "eat"
4, 1, "eat"
5, 2, "sleep",
6, 3, "run"
]
;
[temp1]:
Load
task_id,
Max(id) as max_id
Resident activities
Group By task_id;
Left Join (temp1)
Load
id,
task_id,
activity
Resident activities;
Drop Table activities;
Rename Table temp1 to activities;
but it didn't work:
[activities]:
Load * Inline [
id, task_id, activity,
1, 1, "run"
2, 1, "sleep"
3, 2, "eat"
4, 1, "eat"
5, 2, "sleep",
6, 3, "run" ];
Inner Join
LOAD
Max(id) as id
Resident activities
Group By task_id;
Just change the field name here:
[temp1]:
Load
task_id,
Max(id) as id
Resident activities
Group By task_id;
[activities]:
Load * Inline [
id, task_id, activity,
1, 1, "run"
2, 1, "sleep"
3, 2, "eat"
4, 1, "eat"
5, 2, "sleep",
6, 3, "run" ];
Inner Join
LOAD
Max(id) as id
Resident activities
Group By task_id;
You can achieve the same with where exists try below
[temp1]:
Load
task_id,
Max(id) as max_id
Resident activities
Group By task_id;
Noconcatenate
Final:
Load
id,
task_id,
activity
Resident activities
Where exists (id,max_id)
Drop Table activities, temp1;
@vincent_ardiet_ It works, but I don't understand why: why it worked when I changed it to be "id"? Because, even when it's not "id" but "max_id", isn't it supposed to left-join by task_id (and only keep those from temp1?)
Working Solution:
[activities]:
Load * Inline
[
id, task_id, activity,
1, 1, "run"
2, 1, "sleep"
3, 2, "eat"
4, 1, "eat"
5, 2, "sleep",
6, 3, "run"
]
;
[temp1]:
Load
task_id,
Max(id) as id
Resident activities
Group By task_id;
Left Join (temp1)
Load
id,
task_id,
activity
Resident activities;
Drop Table activities;
Rename Table temp1 to activities;
Because your task_id are not unique values.
[temp1] contents:
max_id, task_id
4, 1, "eat"
5, 2, "sleep",
6, 3, "run"
Then you join with [activities], the join is using only common fields, here there is only [task_id].
QS is keeping in [activities] all rows with a task_id among 1, 2 and 3. So everything.
It works and even more elegant (as it has less lines of code).
When I tried to run, I got the following error: Field 'max_id' not found
Yes because it should be
Where exists (max_id,id)
Oh yes! Thank you @vincent_ardiet_