Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Remove duplicates with condition

Hi,

 

I have below scenario where I have to remove duplicates.  If my  ID column has duplicates and then look at the status column. If status column = Active or Inactive, then pick that row and remove other duplicate IDs for that row. I need to look my status column only when there r duplicates IDs. Example:

 

ID                           Status              Stage

000xnT2eYUTE     Active              To be billed

000xnT2eYUTE     Close               billed

000xnT2eYUTE     Close               billed

000xnT2eABCD    Inprogress        waived

 

So I want my output to be:

ID                           Status              Stage

000xnT2eYUTE     Active              To be billed

000xnT2eABCD    Inprogress        waived

 

How do I do it..?

Labels (2)
1 Solution

Accepted Solutions
vboppudi
Partner - Creator III
Partner - Creator III

Hi KanT,

 

Please try below.

0683p000009Lx3E.png0683p000009Lwwd.png0683p000009LwqV.png0683p000009LwzS.png

Regards,

View solution in original post

11 Replies
manodwhb
Champion II
Champion II

@KanT,find below.

0683p000009Lwuh.png0683p000009LwoM.png0683p000009Lx2f.png0683p000009Lwa3.png

manodwhb
Champion II
Champion II

@KanT,you can use tMap instead of tFilterRow and tFiltercoulumns.

Anonymous
Not applicable
Author

Hi Manodwhb,

 

Thanks for your reply. These screenshots are v helpful. I have few questions though:

 

1. Instead of tfixed input, I am using Tinputdelimited. I guess it shouldn't impact, but just want to crosscheck with you.

2. If status column = Active or Inactive, then pick that row and remove other duplicate IDs for that row.

 

Thanks again!

vboppudi
Partner - Creator III
Partner - Creator III

Hi KanT,

 

So as per my understanding, You need to identify duplicates based on ID. and If you have any duplicates in ID then consider only only one record based on Status. Priority of status will be like Active, Inactive, etc... correct me if i am wrong.

 

000xnT2eYUTE     Active              To be billed

000xnT2eYUTE     Close               billed

000xnT2eYUTE     Close               billed

000xnT2eY12E     InActive            billed

000xnT2eABCD    Inprogress        waived

000xnT2eABCD    Close               billed

000xnT2eABCD    Active              To be billed

000xnT2e123D    Active              To be billed

 

Output will be:

000xnT2eYUTE     Active              To be billed

 

000xnT2eY12E     InActive            billed

000xnT2eABCD    Active              To be billed

000xnT2e123D    Active              To be billed

 

Is this expected output or any changes?

Regards,

Anonymous
Not applicable
Author

Hi vboppudi,

 

Your understanding is correct. If there are duplicates, I want to look at my Status column. if Status column = Active or Inactive, then pick that row only and ignore rest duplicates. I will just modify your example a bit:

 

000xnT2eYUTE     Close              To be billed

000xnT2eYUTE     Active              billed

000xnT2eYUTE     Close               billed

000xnT2eY12E     Close               billed

000xnT2eABCD    Inprogress        waived

000xnT2eABCD    Close               billed

000xnT2eABCD    Active              To be billed

000xnT2e123D    Active              To be billed

 

Output will be:

000xnT2eYUTE     Active              billed

 000xnT2eY12E     Close            billed (ID here is not duplicate, so it will ignore status field in this case)

000xnT2eABCD    Active              To be billed

000xnT2e123D    Active              To be billed

 

How do I use this logic in taggregaterow? I did as per the screenshot mentioned in the post above but my results are not write.

 

 

vboppudi
Partner - Creator III
Partner - Creator III

Hi KanT,

 

Please try below.

0683p000009Lx3E.png0683p000009Lwwd.png0683p000009LwqV.png0683p000009LwzS.png

Regards,

Anonymous
Not applicable
Author

Hi Vboppudi,

 

This works great! Thanks a  lot!

One small question, I have 10 more columns in my table. Now my output file populates all the other columns as blank. Where do I make the change so that all additional columns are populated.

 

Ex.

ID                           Status             Stage                                  Name                  Resource

000xnT2eYUTE     Active              To be billed                        ABC                     Htytm

000xnT2eYUTE     Close               billed                                  ABC                     Htytm

000xnT2eYUTE     Close               billed                                  ABC                     Htytm

000xnT2eY12E     InActive            billed                                  DRF                      uuyuu

000xnT2eABCD    Inprogress        waived                               GGG                     yuyuy

000xnT2eABCD    Close               billed                                   GGG                     yuyuy

000xnT2eABCD    Active              To be billed                           GGG                     yuyuy

000xnT2e123D    Active              To be billed                            AAA                     ggsgs

 

Output will be:

000xnT2eYUTE     Active              To be billed                   ABC                     Htytm

 000xnT2eY12E     InActive            billed                             DRF                      uuyuu

000xnT2eABCD    Active              To be billed                    GGG                     yuyuy

000xnT2e123D    Active              To be billed                  AAA                     ggsgs

 

should I add these all these columns In my 'taggregate row' operations..?

vboppudi
Partner - Creator III
Partner - Creator III

You have to add required columns from source to all components. in Aggregate select first for all columns.

Regards,

manodwhb
Champion II
Champion II

In tAggregaterow,you need need gorup by all input colonnade.