Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have below scenario where I have to remove duplicates. If my ID column has duplicates and then look at the status column. If status column = Active or Inactive, then pick that row and remove other duplicate IDs for that row. I need to look my status column only when there r duplicates IDs. Example:
ID Status Stage
000xnT2eYUTE Active To be billed
000xnT2eYUTE Close billed
000xnT2eYUTE Close billed
000xnT2eABCD Inprogress waived
So I want my output to be:
ID Status Stage
000xnT2eYUTE Active To be billed
000xnT2eABCD Inprogress waived
How do I do it..?
@KanT,find below.
@KanT,you can use tMap instead of tFilterRow and tFiltercoulumns.
Hi Manodwhb,
Thanks for your reply. These screenshots are v helpful. I have few questions though:
1. Instead of tfixed input, I am using Tinputdelimited. I guess it shouldn't impact, but just want to crosscheck with you.
2. If status column = Active or Inactive, then pick that row and remove other duplicate IDs for that row.
Thanks again!
Hi KanT,
So as per my understanding, You need to identify duplicates based on ID. and If you have any duplicates in ID then consider only only one record based on Status. Priority of status will be like Active, Inactive, etc... correct me if i am wrong.
000xnT2eYUTE Active To be billed
000xnT2eYUTE Close billed
000xnT2eYUTE Close billed
000xnT2eY12E InActive billed
000xnT2eABCD Inprogress waived
000xnT2eABCD Close billed
000xnT2eABCD Active To be billed
000xnT2e123D Active To be billed
Output will be:
000xnT2eYUTE Active To be billed
000xnT2eY12E InActive billed
000xnT2eABCD Active To be billed
000xnT2e123D Active To be billed
Is this expected output or any changes?
Regards,
Hi vboppudi,
Your understanding is correct. If there are duplicates, I want to look at my Status column. if Status column = Active or Inactive, then pick that row only and ignore rest duplicates. I will just modify your example a bit:
000xnT2eYUTE Close To be billed
000xnT2eYUTE Active billed
000xnT2eYUTE Close billed
000xnT2eY12E Close billed
000xnT2eABCD Inprogress waived
000xnT2eABCD Close billed
000xnT2eABCD Active To be billed
000xnT2e123D Active To be billed
Output will be:
000xnT2eYUTE Active billed
000xnT2eY12E Close billed (ID here is not duplicate, so it will ignore status field in this case)
000xnT2eABCD Active To be billed
000xnT2e123D Active To be billed
How do I use this logic in taggregaterow? I did as per the screenshot mentioned in the post above but my results are not write.
Hi KanT,
Please try below.
Regards,
Hi Vboppudi,
This works great! Thanks a lot!
One small question, I have 10 more columns in my table. Now my output file populates all the other columns as blank. Where do I make the change so that all additional columns are populated.
Ex.
ID Status Stage Name Resource
000xnT2eYUTE Active To be billed ABC Htytm
000xnT2eYUTE Close billed ABC Htytm
000xnT2eYUTE Close billed ABC Htytm
000xnT2eY12E InActive billed DRF uuyuu
000xnT2eABCD Inprogress waived GGG yuyuy
000xnT2eABCD Close billed GGG yuyuy
000xnT2eABCD Active To be billed GGG yuyuy
000xnT2e123D Active To be billed AAA ggsgs
Output will be:
000xnT2eYUTE Active To be billed ABC Htytm
000xnT2eY12E InActive billed DRF uuyuu
000xnT2eABCD Active To be billed GGG yuyuy
000xnT2e123D Active To be billed AAA ggsgs
should I add these all these columns In my 'taggregate row' operations..?
You have to add required columns from source to all components. in Aggregate select first for all columns.
Regards,