Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
How to remove duplicate values in the file.
Example:
date barcode
01-05-2016 123
02-05-2016 456
03-06-2016 789
05-06-2016 123
09-04-2016 456
output:
date barcode
01-05-2016 123
09-04-2016 456
03-06-2016 789
its an urgent requirement can u pls help me......
Regards,
Ravi.
May be like this:
Data:
Load * inline [
date, barcode
01-05-2016, 123
02-05-2016, 456
03-06-2016, 789
05-06-2016, 123
09-04-2016, 456
];
NoConcatenate
Final:
Load *
Where Count =1;
Load
date,
barcode,
If(barcode<>Previous(barcode),1,0) as Count
Resident Data
Order by barcode;
Drop Table Data;
As I understand you have loaded a table:
date barcode
01-05-2016 123
02-05-2016 456
03-06-2016 789
05-06-2016 123
09-04-2016 456
You wish to have a result like this:
date barcode
01-05-2016 123
09-04-2016 456
03-06-2016 789
Am I right?
UPDATE: what is the logic while picking up the dates?
Ravi you need to provide us with the logic to select those dates. Are you selecting Max(Date) for a particular barcode or a minimum date for a particular date? first value or last value?
Also, provide the information about your date, is it MM-DD-YYYY or DD-MM-YYYY. The dates you have provided can be in any format. It would help to add a date like 21-06-2016 or 06-21-2016 when providing a sample, so that we can differentiate between the format
Hi,
maybe helpful:
table1:
LOAD * INLINE [
date, barcode
01-05-2016, 123
05-06-2016, 123
09-04-2016, 456
02-05-2016, 456
03-06-2016, 789
]
Where not Exists(barcode);
regards
Marco
Exact my problem is......
I have data like this......1000 records
sample data:
Barcode create_date prepared _By
123 24-04-2016 ravi
124 24-04-2016 teja
123 29--04- 2016 ---
345 26-04-2016 ----
346 26-04-2016 ---
345 29-04-2016 ---
I want these outputs:
0utput1:
Barcode create date prepared _By
123 24-04-2016 ravi
124 24-04-2016 teja
345 26-04-2016 ----
346 26-04-2016 ---
In above output using where condition prepared _By =''--' output2 will come no problem.
0utput2:
Barcode create date prepared _By
345 26-04-2016 ----
346 26-04-2016 ---
my main problem is output1 is not getting perfectly while using Thirumal script....its workinh some records its removing fst record and some times second records its removing.....In duplicate want remove only seconds time reapted records.
regards,
Ravi
Have you tried already Marco's suggestion using
WHERE NOT EXISTS(Barcode);
?
Please note that this will not reproduce your original sample:
date barcode
01-05-2016 123
02-05-2016 456 // why not pick this record?
03-06-2016 789
05-06-2016 123
09-04-2016 456
output:
date barcode
01-05-2016 123
09-04-2016 456
03-06-2016 789
The below script will remove the second time repeated records:
Data:
Load * inline [
date, barcode
01-05-2016, 123
02-05-2016, 456
03-06-2016, 789
05-06-2016, 123
09-04-2016, 456
];
NoConcatenate
Final:
Load *
Where Count =1;
Load
barcode,
date,
If(barcode<>Previous(barcode),1,0) as Count
Resident Data
Order by barcode,date;
Drop Table Data;
Hope this is what you are looking for.
The below script is based on your new data:
Data:
Load * Inline [
Barcode, create_date, prepared_By
123, 24-04-2016, ravi
124, 24-04-2016, teja
123, 29-04-2016,
345, 26-04-2016,
346, 26-04-2016,
345, 29-04-2016,
];
NoConcatenate
Final:
Load *
Where Count =1;
Load
Barcode,
create_date,
prepared_By,
If(Barcode<>Previous(Barcode),1,0) as Count
Resident Data
Order by Barcode,create_date;
Drop Table Data;
Thank you Now its working fine...