Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi
I'm trying to delete duplicate rows (activities) where the fields Start, End and Status are all empty. So if an activity exists more than once (like Activity A), the empty ones should be deleted (like Activity_ID 001). But only if all 3 fields are empty
Activity | Start | End | Status | Activity_ID |
---|---|---|---|---|
A | 001 | |||
A | 01-01-2014 | 05-05-2014 | Completed | 002 |
B | 003 | |||
C | 004 | |||
D | 05-05-2015 | Waiting | 005 |
Thanks
Hi
This works, but makes the assumptions that there is only one populated line, but there may be one or more unpopulated lines:
Data:
LOAD Activity,
Start,
End,
Status,
Activity_ID,
If(Len(Start) = 0 And Len(End) = 0 And Len(Status) = 0, Null(), Activity_ID) As TActivity_ID
Inline
[
Activity,Start,End,Status,Activity_ID
A,,,,001
A,01-01-2014,05-05-2014,Completed,002
B,,,,003
C,,,,004
D,05-05-2015,,Waiting,005
];
Results:
NoConcatenate
LOAD Activity,
Max(Start) As Start,
Max(End) As End,
MaxString(Status) As Status,
Num(Max(Alt(TActivity_ID, Activity_ID), '000')) As Activity_ID
Resident Data
Group By Activity;
DROP Table Data;
The TActivity_ID field stores the Activity_ID of the populated field. The Alt() selects this value if it exists.
See attached
Jonathan
Load * From Table
Where Len(Trim(Start)) = 0 And Len(Trim(End)) = 0 And Len(Trim(Status)) = 0
Hi
Have a look this one.
Thanks
your suggestion deletes all empty rows. Only rows that duplicate AND empty should be deleted.
Activity A exists twice and one row (Activity_ID 001) is completely empty, so that row should be deleted.
Temp:
Load * From Table
Join(Temp)
Load Activity,Count(Activity) As Cnt Resident Temp;
Final:
NoConcatenate
Load * Resident Temp
Where Len(Trim(Start)) = 0 And Len(Trim(End)) = 0 And Len(Trim(Status)) = 0 And Cnt > 1;
Drop Table Temp;
modify Where statement.
Where Cnt=1 or (Len(Trim(Start))>0 AND Len(Trim(End))>0 AND Len(Trim(Status))>0)
Hi
This works, but makes the assumptions that there is only one populated line, but there may be one or more unpopulated lines:
Data:
LOAD Activity,
Start,
End,
Status,
Activity_ID,
If(Len(Start) = 0 And Len(End) = 0 And Len(Status) = 0, Null(), Activity_ID) As TActivity_ID
Inline
[
Activity,Start,End,Status,Activity_ID
A,,,,001
A,01-01-2014,05-05-2014,Completed,002
B,,,,003
C,,,,004
D,05-05-2015,,Waiting,005
];
Results:
NoConcatenate
LOAD Activity,
Max(Start) As Start,
Max(End) As End,
MaxString(Status) As Status,
Num(Max(Alt(TActivity_ID, Activity_ID), '000')) As Activity_ID
Resident Data
Group By Activity;
DROP Table Data;
The TActivity_ID field stores the Activity_ID of the populated field. The Alt() selects this value if it exists.
See attached
Jonathan
Hi,
PFA,
By using staight table, you can acheive your requirement.
Warm Regards,
Joshmi