Qlik Community

New to Qlik Sense

Discussion board where members can get started with Qlik Sense.

dgoel1991
New Contributor II

To remove duplicates in two column

Hi Everyone,

I have a requirement which says to remove the duplicates from the column - "Bill to Customer Name and Customer Part Number" and then load the data. Attach is the excel sheet.

Thanks in advance!

Thanks and Regards,

Drishti Goel

Tags (1)
10 Replies
MVP
MVP

Re: To remove duplicates in two column

Can you give an example of which rows are duplicate and why?

dgoel1991
New Contributor II

Re: To remove duplicates in two column

So, if you filter with Quote Number 21179637 you can see the duplicate data as this quote has been quoted multiple times but it was won only one time. So, Now I have remove the duplicates from the column Bill to Customer Name and Customer Part number in order to remove duplicate entry.

Thanks and Regards,

Drishti Goel

dgoel1991
New Contributor II

Re: To remove duplicates in two column

Any luck ?

dgoel1991
New Contributor II

Re: To remove duplicates in two column

I tried writing the below code, but its not working for me

[Sheet1]:

LOAD DISTINCT *

FROM [lib://Documents/ODS.SIS.MKT.00019 - Open Quote Details (1).xlsx]

(ooxml, embedded labels, table is Sheet1) WHERE Exists([Customer Part Number],[Bill To Customer Name]);

Employee
Employee

Re: To remove duplicates in two column

Hi Drishti,

From your earlier comment, I filtered on the Quote Number 21179637 and noticed that in the Marketing Status field there is an indication of QLRC which could appears to be the one.  By filtering on this data, you reduce the rows from 55,384 to 7,395.

1507730.PNG

This is the load script:

[Sheet1]:

LOAD DISTINCT *

FROM [lib://Documents/ODS.SIS.MKT.00019 - Open Quote Details (1).xlsx]

(ooxml, embedded labels, table is Sheet1)

WHERE [Marketing Status]='QLRC';

Not sure if this is your answer or not.

The problem with your previous solution is that the where exists is looking for data already loaded in the script.

dgoel1991
New Contributor II

Re: To remove duplicates in two column

Hi Lisa,

Thanks for your reply.

I guess that every row has different values and if I'm putting distinct then its gonna fetch all the 6 rows. Ideally I cannot take a filter on Marketing Status because from the business point of view we can won the status even if it is in 'Draft'. So here in this case, ideally I should not use distinct. But my actual goal is to load the data and put the filter on ([Customer Part Number],[Bill To Customer Name]) so that I can remove the duplicated data. How can I achieve this?

Thanks and Regards,

Drishti Goel

dgoel1991
New Contributor II

Re: To remove duplicates in two column

Any luck ?

Employee
Employee

Re: To remove duplicates in two column

Hi Drishti,

I still do not understand your data well enough to work this out.  Looking at the fields every row has a [Bill to Customer Name], but many rows do not have [Customer Part Number], and I can't see any relationship with what was sold.  There is still repetition.

Regards,

Lisa

bwisenosimenkg
Valued Contributor

Re: To remove duplicates in two column

Table:

load * from

[lib://Documents/ODS.SIS.MKT.00019 - Open Quote Details (1).xlsx]

(ooxml, embedded labels, table is Sheet1)

where (not exists([Bill to Customer Name] &'|'& [Customer Part Number]));


//(not exists()) Will be true for the first row read, and then false for the second row read for each combination of the two fields.