Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Toronto Sept 9th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

How to remove duplicates in a file

Hi All,

How to remove  duplicate values in the file.

Example:

date                     barcode

01-05-2016          123

02-05-2016           456

03-06-2016          789

05-06-2016          123

09-04-2016           456          

output:

date                     barcode

01-05-2016          123

09-04-2016           456

03-06-2016          789

its an urgent requirement can u pls help me......

Regards,

Ravi.

9 Replies
trdandamudi
Master II
Master II

May be like this:

Data:

Load * inline [

date,               barcode

01-05-2016,          123

02-05-2016,           456

03-06-2016,          789

05-06-2016,          123

09-04-2016,           456  

];

NoConcatenate

Final:

Load *

Where Count =1;

Load

date,

barcode,

If(barcode<>Previous(barcode),1,0) as Count

Resident Data

Order by barcode;

Drop Table Data;

MindaugasBacius
Partner - Specialist III
Partner - Specialist III

As I understand you have loaded a table:

date                     barcode

01-05-2016          123

02-05-2016           456

03-06-2016          789

05-06-2016          123

09-04-2016           456

You wish to have a result like this:

date                     barcode

01-05-2016          123

09-04-2016           456

03-06-2016          789

Am I right?

UPDATE: what is the logic while picking up the dates?

sunny_talwar

Ravi you need to provide us with the logic to select those dates. Are you selecting Max(Date) for a particular barcode or a minimum date for a particular date? first value or last value?

Also, provide the information about your date, is it MM-DD-YYYY or DD-MM-YYYY. The dates you have provided can be in any format. It would help to add a date like 21-06-2016 or 06-21-2016 when providing a sample, so that we can differentiate between the format

MarcoWedel

Hi,

maybe helpful:

QlikCommunity_Thread_215138_Pic1.JPG

table1:

LOAD * INLINE [

    date, barcode

    01-05-2016, 123

    05-06-2016, 123

    09-04-2016, 456

    02-05-2016, 456

    03-06-2016, 789

]

Where not Exists(barcode);

regards

Marco

Not applicable
Author

Exact my problem is......

I have data like this......1000 records

sample data:

Barcode   create_date          prepared _By

123            24-04-2016           ravi

124            24-04-2016           teja

123            29--04- 2016         ---

345            26-04-2016           ----

346            26-04-2016           ---

345            29-04-2016           ---

I want these outputs:

0utput1:

Barcode      create date         prepared _By

123            24-04-2016           ravi

124            24-04-2016           teja

345            26-04-2016           ----

346            26-04-2016           ---

In above output using where condition prepared _By =''--' output2 will come no problem.

0utput2:

Barcode      create date         prepared _By

345            26-04-2016           ----

346            26-04-2016           ---

my main problem is output1 is not getting perfectly while using  Thirumal script....its workinh  some records its removing fst record and  some times second records its removing.....In duplicate  want remove only seconds time reapted records.

regards,

Ravi

swuehl
MVP
MVP

Have you tried already Marco's suggestion using

WHERE NOT EXISTS(Barcode);

?

Please note that this will not reproduce your original sample:

date                     barcode

01-05-2016          123

02-05-2016           456 // why not pick this record?

03-06-2016          789

05-06-2016          123

09-04-2016           456         

output:

date                     barcode

01-05-2016          123

09-04-2016           456

03-06-2016          789

trdandamudi
Master II
Master II

The below script will remove the second time repeated records:

Data:

Load * inline [

date,              barcode

01-05-2016,          123

02-05-2016,          456

03-06-2016,          789

05-06-2016,          123

09-04-2016,          456

];

NoConcatenate

Final:

Load *

Where Count =1;

Load

barcode,

date,

If(barcode<>Previous(barcode),1,0) as Count

Resident Data

Order by barcode,date;

Drop Table Data;

Hope this is what you are looking for.

trdandamudi
Master II
Master II

The below script is based on your new data:

Data:

Load * Inline [

Barcode,   create_date,          prepared_By

123,       24-04-2016,           ravi

124,       24-04-2016,           teja

123,       29-04-2016,        

345,       26-04-2016,          

346,       26-04-2016,          

345,       29-04-2016,          

];

NoConcatenate

Final:

Load *

Where Count =1;

Load

Barcode,

create_date,

prepared_By,

If(Barcode<>Previous(Barcode),1,0) as Count

Resident Data

Order by Barcode,create_date;

Drop Table Data;

Not applicable
Author

Thank you Now its working fine...