
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Replace function: Find Multiple strings and replace to blank
As a part of cleaning my Big Data text, I have to replace multiple stings to blank (I have almost 40 characters/strings to replace blank).
I am using
LOAD
...
Replace(‘replace_field','string_A',’’) as ReplacedString,
...
FROM ...;
Can I replace all my strings (example: string_B, String_C, String_D etc) to blank in single command?
- « Previous Replies
- Next Replies »
Accepted Solutions


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
map1:
mapping load * inline [
x, y
happy, ''
dinner, ''
] ;
Directory;
LOAD
SaelsOrder_ID,
SalesOrder_BP,
SalesOrder_Text,
mapsubstring('map1',SalesOrder_Text) as Changed
FROM
[152744.xlsx]
(ooxml, embedded labels, table is Sheet1);

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
PFA


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Single characters can be cleaned using keepchar() or purgechar(), just remember these functions are case sensitive.
Replace() can cleanse strings, you can nest several replace commands together.
For example. Replace(replace(replace(yourfieldname, 'abc',''), 'def',''), 'xyz', '') as cleandata


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Excel data:
Script:
map1:
mapping load * inline [
x, y
Robert, ''
Poland, ''
from,'' ] ;
Directory;
LOAD mapsubstring('map1',String)
FROM
[..\..\book2.xlsx]
(ooxml, embedded labels, table is Sheet1);
Outcome:

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Muralidhar,
Try:
if(match(fieldname,'string_A','string_B','string_C','string_D'),'') as ReplacedString;
or
if(wildmatch(fieldname,'string*'),'') as ReplacedString;
Regards
Neetha

.png)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
Try MapSubString() and Mapping Load concepts to achieve this. Refer Robert_Mika post for this. This is the best option if you have around 40 values to be replaced.
Regards,
Jagan.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
this is the script if you have personal edition
// make some test string
Source:
load
chr(Ord('a') + rand()*26)
&chr(Ord('a') + rand()*26)
&chr(Ord('a') + rand()*26)
&chr(Ord('a') + rand()*26)
& chr(Ord('A') + rand()*26)
& chr(Ord('A') + rand()*26)
& chr(Ord('A') + rand()*26) as str
AutoGenerate 100;
Concatenate (Source)
load str inline [
str
aaa
bbb
ccc
ddd
eee
fff
];
// string to replace with blank
MapString:
Mapping LOAD * inline [
from, to
aaa,
bbb,
ccc,
ddd,
eee,
fff,
];
//
Final:
load
*,
// replace with a single command the string in field from of MapString with blank (field to)
ApplyMap('MapString', str) as newstr
Resident
Source;
DROP Table Source;

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@neetha Your command is replacing whole string to blank. But i would like to replace only particular portion of the string.
Example: hello how are you? (I want to replace hello with blank) and my end result will be... how are you?
Is that make sense? please help.


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Did you check my example?
You can replace multiple strings.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@ Robert
I love to explore this. Can you help me in the scenario if I have more than one field in my data set?
Example:
I have SalesOrder_ID, SalesOrder_BP and SalesOrder_Text (Here I want to change/replace strings only Sales Order Text field)
Thanks for your help.

- « Previous Replies
- Next Replies »