Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
RSvebeck
Specialist
Specialist

Find customers with duplicate invoice ID

Hi
I previously posted a question about this, but the example data in my previous quiestion was not complex enough. So I do new post with a better example.

I need to make a list of all customers who have duplicate invoice ID's with other customers.

InvoiceID CustomerID
101 A
102 A
103 A
104 A
501 A
201 B
202 B
203 B
204 B
101 C
301 C
302 C
303 C
304 C
501 C
101 E
501 E
101 F
501 F
900 G
901 G
301 K


What I want : A list that contains customerID = A,C,E,F,K

How do I do it - with a expression? I know how to do it in the load script.

I attached an more "explenational example".


brg Robert

Svebeck Consulting AB
7 Replies
erichshiino
Partner - Master
Partner - Master

Hi,

I'm not sure if this will be enough, but in the first pivot ( 'Not a complete solution '), you can remove the second dimension and add the following expression:

concat ( DISTINCT CustomerID, ', ')

The result will be the one you described in your post: customerID = A,C,E,F,K

Hope this helps,

Erich

RSvebeck
Specialist
Specialist
Author

Hi. Thanks for your suggestion.

However this is not what I want. My post was misleading ( I don't want a list like A,C,E in a row)

I want a list where customeID is the only (visible) column. (as descibed in the docuement)

The reason is that in me "real world" data, there are 10'000 customers and 100'000 invoices, and we can see that there are duplicates in invoice ID, which makes a merge not possible. So I want to list all customers that have invoice numbers that are duplicates (or actually I need a list of the opposite, customers which have no duplicate invoices.

Simply  - a list all customersID's where there is a duplicate invoice. And only list a customer ID once (even if that custumer has several duplicate invoices).

//Robert

Svebeck Consulting AB
RSvebeck
Specialist
Specialist
Author

Just noticed that if I create a chart with CustomerID as the only dimension and this expression
aggr(count(CustomerID), InvoiceID)

I end up getting a list of customerID's that has no duplicate invoices (which is kinda what I want). It seems to be the solution to my request, but I don't really understand why and how it works... and I still need the opposite list -> List of customers who have duplicate invoices with other customers...  //Robert
Svebeck Consulting AB
SunilChauhan
Champion
Champion

see the attached file

hope this helps

Sunil Chauhan
RSvebeck
Specialist
Specialist
Author

Hi. Did you post the right document? The one you attached displays all customerID's in a chart. //Robert

Svebeck Consulting AB
SunilChauhan
Champion
Champion

sorry

see the attached file

hope this helps

Sunil Chauhan
Not applicable

Create a listbox for InvoiceID if you don't already have one. Click on the titlebar of that listbox to select it. Now, type this:   =count(distinct CustomerID)>1       and hit <enter>. That will select all of the InvoiceID values belonging to more than one customer. At this point your chart may be the way you want it.

If you want to select those customers go to a listbox for the CustomerID, right-click, and choose Select Possible   and it will have selected the Customer who share the same InvoiceID. At that point, you may unselect the InvoiceID if necessary.