Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
pnn44794
Partner - Specialist
Partner - Specialist

How To Remove Duplicates and Count Rows?

Hello:

 

I have a dataset that has duplicate entries in the data for LineItemId and LineItemFactId.

 

As always, any and all help is appreciated.  Thanks in advance.

1 Solution

Accepted Solutions
balar025
Creator III
Creator III

You can take out those field which are unique combination and then can have sum of flag

like

Load distinct Order_id,

ParentLineItemId,

LineItemId,

1 as lineitemidcount

where not isnull(ParentLineItemId)

use sum(lineitemidcount)

View solution in original post

4 Replies
Anonymous
Not applicable

Hi Perry,

Hope the below solution will help you to solve your issue:

Count({<ParentLineItemId-={'-'}>}distinct LineItemId)

Thanks & Regards,

Neelima

balar025
Creator III
Creator III

You can take out those field which are unique combination and then can have sum of flag

like

Load distinct Order_id,

ParentLineItemId,

LineItemId,

1 as lineitemidcount

where not isnull(ParentLineItemId)

use sum(lineitemidcount)

pnn44794
Partner - Specialist
Partner - Specialist
Author

Thank you for the reply Neelima.  Sorry for the delay in responding back to you.  Your solution will definitely give me the count but does not remove the duplicates.  As it turns out, the duplicate problem was due to issues with the data itself.  I was verifying that and that's why it took me awhile to respond.

pnn44794
Partner - Specialist
Partner - Specialist
Author

Hello Ravi:

Sorry for the delay in replying.  As it turns out, there's a problem with the data itself that's causing the duplicates.  Your solution did remove some of the duplicate lines (where ParentLineItemId is null) and I suspect if not for the data issues, your solution would remove any others.  Thanks for your reply and I'll be marking your answer as the correct answer.  Thanks again.