Formatting data from excel

Report Inappropriate Content · ‎2014-07-02

Hi All,

I am using data from excel in which City column is in below format,

Ajmer

Ajmer.

Bareliu.P.

Bareli U.P

Distt. Kullu H.P.

Distt. Kullu (H.P.)

....

...

....

How can I remove duplicates from the column.?

ashfaq_haseeb · ‎2014-07-02

Hi,

Use match function.

Regards

ASHFAQ

hic · ‎2014-07-02

I would use ApplyMap to clean data. To do this, you need a table listing the faulty entries and how these should be corrected:

Map:

Mapping Load * Inline

[From, To

Ajmer., Ajmer

Bareli U.P, Bareliu.P.

Distt. Kullu (H.P.), Distt. Kullu H.P. ];

Then you can use this when loading the data:

ApplyMap('Map', City) as City,

HIC

Report Inappropriate Content · ‎2014-07-02

Hi Ashfaq,

There are almost more than 1000 records, so cant use match function.

Report Inappropriate Content · ‎2014-07-02

Hi Henric,

Thanks for ur reply, but I am having more than 1000 records . Do you have any better solution?

Anonymous · ‎2014-07-02

Hi Rajeshree

ReplaceMap:

MAPPING LOAD * INLINE [

char replace

)

(

"

,

/

-

.

] (delimiter is ' ')

;

String:

Load

*,

MapSubString('ReplaceMap', City) as Replaced_String

;

Load * Inline [

City

Ajmer

Ajmer.

Bareliu.P.

Bareli U.P

Distt. Kullu H.P.

Distt. Kullu (H.P.)

];

Instead of this Inline - "City", load the corresponding column from the source and try.

Vijit

Report Inappropriate Content · ‎2014-07-02

Hi Vijit

Thank you for response . Will try this and get back to you incase of more queries

Best Regards

Your Ex-TeamMate

Report Inappropriate Content · ‎2014-07-02

Hi,

I believe this should work for you! Test once after loading data.

Test:

LOAD * INLINE [

Name

Ajmer

Ajmer.

Bareliu.P.

Bareli U.P

Distt. Kullu H.P.

Distt. Kullu (H.P.)

];

New:

NoConcatenate

Load Name,Peek(Name) as NewName, if(WildMatch(Purgechar(Name,'(.)-_ '),'*'&Purgechar(Peek(Name),'(.)-_ ')&'*')>0,1,0) as Filter

Resident Test

order by Name asc

;

Final:

NoConcatenate

Load NewName

Resident New

Where Filter=1;

Drop Table New;

Drop Table Test;

Regards,

Prabhu

Related Topics