Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
I am using data from excel in which City column is in below format,
Ajmer
Ajmer.
Bareliu.P.
Bareli U.P
Distt. Kullu H.P.
Distt. Kullu (H.P.)
....
...
....
How can I remove duplicates from the column.?
Hi,
Use match function.
Regards
ASHFAQ
I would use ApplyMap to clean data. To do this, you need a table listing the faulty entries and how these should be corrected:
Map:
Mapping Load * Inline
[From, To
Ajmer., Ajmer
Bareli U.P, Bareliu.P.
Distt. Kullu (H.P.), Distt. Kullu H.P. ];
Then you can use this when loading the data:
ApplyMap('Map', City) as City,
HIC
Hi Ashfaq,
There are almost more than 1000 records, so cant use match function.
Hi Henric,
Thanks for ur reply, but I am having more than 1000 records . Do you have any better solution?
Hi Rajeshree
ReplaceMap:
MAPPING LOAD * INLINE [
char replace
)
(
"
,
/
-
.
] (delimiter is ' ')
;
String:
Load
*,
MapSubString('ReplaceMap', City) as Replaced_String
;
Load * Inline [
City
Ajmer
Ajmer.
Bareliu.P.
Bareli U.P
Distt. Kullu H.P.
Distt. Kullu (H.P.)
];
Instead of this Inline - "City", load the corresponding column from the source and try.
Vijit
Hi Vijit
Thank you for response . Will try this and get back to you incase of more queries
Best Regards
Your Ex-TeamMate
Hi,
I believe this should work for you! Test once after loading data.
Test:
LOAD * INLINE [
Name
Ajmer
Ajmer.
Bareliu.P.
Bareli U.P
Distt. Kullu H.P.
Distt. Kullu (H.P.)
];
New:
NoConcatenate
Load Name,Peek(Name) as NewName, if(WildMatch(Purgechar(Name,'(.)-_ '),'*'&Purgechar(Peek(Name),'(.)-_ ')&'*')>0,1,0) as Filter
Resident Test
order by Name asc
;
Final:
NoConcatenate
Load NewName
Resident New
Where Filter=1;
Drop Table New;
Drop Table Test;
Regards,
Prabhu