Skip to main content
Announcements
NEW: Seamless Public Data Sharing with Qlik's New Anonymous Access Capability: TELL ME MORE!
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

To remove duplicate rows from two column

Hi Everyone,

I have a requirement which says to remove the duplicates from the column - "Bill to Customer Name and Customer Part Number" and then load the data.  I tried to wrote the below code but it didn't work form. Can you help me with the code. Attach is the excel sheet.

[Sheet1]:

LOAD Distinct *

FROM [lib://Documents/ODS.SIS.MKT.00019 - Open Quote Details (1).xlsx]

(ooxml, embedded labels, table is Sheet1) Where NOT Exists([Customer Part Number],[Bill To Customer Name]);

Thanks in advance!

Thanks and Regards,

Drishti Goel

3 Replies
felipedl
Partner - Specialist III
Partner - Specialist III

Hi Drishti,

Your data isn't duplicated for the field you mentioned, each line has at least one different column for the fields you have on the spreadsheet.

Your script is trying to check the duplicates for the entire line and the not exists, as you've put it, won't have any effect on your data.

Let's say you get the following fields from your file:

          

BranchBill To Customer EB NameBill To NoBill To Customer NameEnd Customer NumberEnd Customer NameISR NameFSR NameQuote Header Creation DateQuote Expiration DateQuote NumberTypeCustomer Part NumberEqual lines
DenverUS_CA_Halifax_Advantage_Denver2171172Vais TechnologyPeter-Paul, MelanieMarsh, Millie04/01/201803/02/201821103557Firm1
DenverUS_CA_Halifax_Advantage_Denver2171172Vais TechnologyPeter-Paul, MelanieMarsh, Millie04/01/201803/02/201821103557Firm1
DenverUS_CA_Halifax_Advantage_Denver2171172Vais TechnologyPeter-Paul, MelanieMarsh, Millie30/01/201801/03/201821177545Firm2
DenverUS_CA_Halifax_Advantage_Denver2171172Vais TechnologyPeter-Paul, MelanieMarsh, Millie30/01/201801/03/201821177545Firm2
DenverUS_CA_Halifax_Advantage_Denver2171172Vais TechnologyPeter-Paul, MelanieMarsh, Millie30/01/201801/03/201821180604Firm3
DenverUS_CA_Halifax_Advantage_Denver2171172Vais TechnologyPeter-Paul, MelanieMarsh, Millie30/01/201801/03/201821180604Firm3

The distinct will get, from the 6 rows above, only 3 (three) lines.

The rest of your data for this 6 rows is differnent for each row in the spreadsheet, so to get the distinct values, you'll have to specify some fields, as I did above.

Felipe.

Anonymous
Not applicable
Author

Hi Felip,

Thanks for your reply.

I understand that every row has different values and if I'm putting distinct then its gonna fetch all the 6 rows. So here in this case, ideally I should not use distinct. But my actual goal is to load the data and put the filter on ([Customer Part Number],[Bill To Customer Name]) so that I can remove the duplicated data. How can I achieve this?

Thanks and Regards,

Drishti Goel

Anonymous
Not applicable
Author

Any lead on the solution will be highly appreciated