Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Remove duplicates

Hi Geniuses,

I have various columns in table but I am facing problem with three columns

Below is the small dataset of big problem

SupplierCode SupplierName InvoiceNo
A02AMPTRON654789
A0003A HARTRODT (S) LOGISTICS PTE LTD1305115
A0003A HARTRODT S'PORE PTE LTD1305115
A0003A HARTRODT SINGAPORE PTE LTD1305115
A03ASEC ASIAPIN010329
A03ASEC ASIAPIN010332
A03ASEC ASIAPIN010335
A03ASEC ASIAPIN010341
A03ASEC ASIAPIN010347
A03ASEC ASIAPIN010349
A03ASEC ASIAPIN010352
A03ASEC ASIAPIN010353
A03ASEC ASIAPIN010358
A03ASEC ASIAPIN010364

If you see here for the supplier A003 there are like common names I want only one name out of the 3 but for the supplier A03 as the invoices are differenrt so I wont have problem there

So in nutshell wherever there is the same supplier code and invoice number pick only one supplier name or you can say pick only one row.

Thanks,

Shashank

10 Replies
sujeetsingh
Master III
Master III

See this sample here it is

Anonymous
Not applicable
Author

Were is the sample sir?

Not applicable
Author

Hi Shashank, You can remove there duplicates by "Exists" function.

If the Invoice is unique among these three fields then try like below:

Load SupplierCode , SupplierName , InVoiceNo

From sourcetable

where not Exists(InvoiceNo) ;

If the Invoice No is same for different SupplierCode then try like below:

Load * where not exists(Key);

Load SupplierCode , SupplierName , InVoiceNo , SupplierCode & InVoiceNo AS Key

From sourcetable ;

Please find the attached file for reference.

Anonymous
Not applicable
Author

My problem is not the exact names my problem is similar looking names like

A0003A HARTRODT (S) LOGISTICS PTE LTD1305115
A0003A HARTRODT S'PORE PTE LTD1305115
A0003A HARTRODT SINGAPORE PTE LTD1305115

Thanks,

Shashank

pandiarajan
Creator
Creator

Dear Shashank,

         This is my suggestion

LOAD SupplierCode,

     MinString(SupplierName) as SupplierName,

     InvoiceNo

FROM [http://community.qlik.com/thread/107248] (html, codepage is 1252, embedded labels, table is @1)

Group By SupplierCode,InvoiceNo;

Anonymous
Not applicable
Author

Hey Dathu,

I see that it works but why does "Where not exist" exclude the other records in suppliername?

jagan
Luminary Alumni
Luminary Alumni

Hi,

Try this script

Data:

LOAD SupplierCode,

     MaxString(SupplierName) AS SupplierName,

     InvoiceNo

FROM [http://community.qlik.com/thread/107248] (html, codepage is 1252, embedded labels, table is @1)

GROUP BY SupplierCode, InvoiceNo;

Hope it helps you.

Regards,

Jagan.

Not applicable
Author

Hi Michael, the Exists function will works in the same table as well. Its like distinct load on the KEY field. If the any expression is not mentioned the field should be loaded from either previous load. In this case the previous load field assumed from same table.

Not applicable
Author

Hi

i would say load first your table without supplier names

then maintain another one with supplier names and codes like

LOAD code, name from table where not exist(code)

BEWARE code field has to NEVER be loaded before even in an other table under same name

ELSE start by loading this table first

Best regards

Chris