Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
chintan1989
Contributor II
Contributor II

Conditional Filter in Script

Hi Friends

I have 2 columns -  ID and Status.  

For any ID's which are repeating and also having Status ='Completed' or 'Retained'  , need to be removed.

Example: In attached Excel,  ID =1 and ID=45  is satisfying above condition and should be removed.    

I want to handle in scripting part.

 

Thanks in advance for your help.

 

 

Thanks & Regards

Chintan 

1 Solution

Accepted Solutions
DavidM
Partner - Creator II
Partner - Creator II

ID29 has statuses 'Completed' and 'Cancelled' - that one should not be removed? In that case, the condition you're looking for should be:

For any ID's which are repeating and also having Status ='Completed' AND 'Retained'

?

EDIT:

If this is true you need to do it in couple of steps:

A:
LOAD
ID,
Status
FROM [lib://dt/Test_Logic.xlsx]
(ooxml, embedded labels, table is Sheet1)
;

Left Join (A)
LOAD
ID,
count(distinct(Status)) as idc
Resident A
Group by ID
;

Now you identified those ID's which have more than 1 status. If you need to remove those which have 'Completed' AND 'Retained' but keep with status 'Cancelled' you need to do this:

B:
Load
ID,
Status
Resident A
Where idc=1;

First you load those which have only 1 ID.

 

Outer join (B)
Load
ID,
Status
Resident A
Where Status='Cancelled' and idc>1;
Drop Table A;

Now you join those with two ID's but one of those has status 'Cancelled' (note: it will join with Status cancelled and only one line). If you need to remove all with more than one status you don't need the second part, you can do it with inner join and where idc=1.

 

 

View solution in original post

4 Replies
DavidM
Partner - Creator II
Partner - Creator II

ID29 has statuses 'Completed' and 'Cancelled' - that one should not be removed? In that case, the condition you're looking for should be:

For any ID's which are repeating and also having Status ='Completed' AND 'Retained'

?

EDIT:

If this is true you need to do it in couple of steps:

A:
LOAD
ID,
Status
FROM [lib://dt/Test_Logic.xlsx]
(ooxml, embedded labels, table is Sheet1)
;

Left Join (A)
LOAD
ID,
count(distinct(Status)) as idc
Resident A
Group by ID
;

Now you identified those ID's which have more than 1 status. If you need to remove those which have 'Completed' AND 'Retained' but keep with status 'Cancelled' you need to do this:

B:
Load
ID,
Status
Resident A
Where idc=1;

First you load those which have only 1 ID.

 

Outer join (B)
Load
ID,
Status
Resident A
Where Status='Cancelled' and idc>1;
Drop Table A;

Now you join those with two ID's but one of those has status 'Cancelled' (note: it will join with Status cancelled and only one line). If you need to remove all with more than one status you don't need the second part, you can do it with inner join and where idc=1.

 

 

agigliotti
Partner - Champion
Partner - Champion

maybe someting as below:

tab:
LOAD
*
where ret > 1;
LOAD
ID,
count(Status) as ret
from ............
where ( Status = 'Completed' or Status = 'Retained' )
group by ID;

I hope it can helps.

chintan1989
Contributor II
Contributor II
Author

Hello David

Thanks for replying.

Need to remove only the ones which are Cancelled and Retained.  

 

Thanks & Regards

Chintan Gala

chintan1989
Contributor II
Contributor II
Author

Thanks David, its working fine.