talendtester
Creator III

[resolved] What is the best way to remove special characters from an entire file?

I have files that are several GB in size and contain odd special characters that make them hard to parse.
Is there an easy way to remove every character that is not a letter or a number from the entire file?

Accepted Solutions
talendtester
Creator III
Author

Thanks archenroot and shong!
This worked:
row1.myRow.replaceAll("[^a-zA-Z0-9]", "");


7 Replies
Anonymous
Not applicable

Use tFileInputFullRow to read the file row by row, remove or replace the special characters in each row, and write the result to a new file.
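For anyone who wants to see the same row-by-row idea outside Talend, here is a minimal sketch in plain Java. The class and method names (RowCleaner, clean) are made up for illustration, and the pattern assumes that only a-z, A-Z and 0-9 should survive. Only one line is held in memory at a time, so the file size should not matter much.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper, not part of Talend.
class RowCleaner {
    // Copies the input to the output one line at a time,
    // keeping only a-z, A-Z and 0-9 in each line.
    static void clean(Reader in, Writer out) {
        try (BufferedReader reader = new BufferedReader(in);
             BufferedWriter writer = new BufferedWriter(out)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line.replaceAll("[^a-zA-Z0-9]", ""));
                writer.newLine(); // writes the platform line separator
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For real files the reader and writer could come from Files.newBufferedReader and Files.newBufferedWriter, each given the charset the file actually uses.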
Jcs19
Creator II

You can also try re-encoding the file.
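A rough sketch of what re-encoding could look like in plain Java. The thread does not say which encodings are involved, so the source and target charsets are left as parameters, and the Reencoder class is a hypothetical helper. Note that this changes how the bytes are interpreted and written; it does not remove any characters by itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical helper, not part of Talend.
class Reencoder {
    // Decodes the input bytes with charset "from" and writes them back out
    // encoded with charset "to", streaming through a fixed-size buffer.
    static void reencode(InputStream in, Charset from, OutputStream out, Charset to) {
        try (Reader reader = new InputStreamReader(in, from);
             Writer writer = new OutputStreamWriter(out, to)) {
            char[] buffer = new char[8192];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

As an example, a file that was really written as ISO-8859-1 but is being parsed as UTF-8 could be passed through with StandardCharsets.ISO_8859_1 as the source and StandardCharsets.UTF_8 as the target.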
Anonymous
Not applicable

Just an extension of the tFileInputFullRow solution...
Connect it to a tJavaRow containing:
row1.myRow.replaceAll("[^a-zA-Z0-9]", "");

Then connect the tJavaRow to the output file. This leaves only alphanumeric characters in the file.
Ladislav
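One thing worth keeping in mind with the tJavaRow step: Java strings are immutable, so replaceAll returns a new string and the result has to be assigned to an output column to have any effect. The sketch below models that outside Talend. The Row class is a hypothetical stand-in for the row struct Talend generates, and outputRow plays the role of the tJavaRow output row (assumed here to have a myRow column as well).

```java
// Hypothetical model of a tJavaRow body; not Talend-generated code.
class TJavaRowSketch {
    // Stand-in for the generated row struct; only the myRow column is modelled.
    static class Row {
        String myRow;
    }

    // Does for one row what the tJavaRow body would do.
    static void process(Row row1, Row outputRow) {
        // replaceAll returns a new String; row1.myRow itself is left unchanged.
        outputRow.myRow = row1.myRow.replaceAll("[^a-zA-Z0-9]", "");
    }
}
```

Inside an actual tJavaRow, the equivalent line would presumably assign to the output row's column in the same way.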
Anonymous
Not applicable

Hello 
Could this be used to get rid of characters like ® and ™?
Thanks
Anonymous
Not applicable

That should remove anything that is NOT a-z, A-Z, or 0-9.
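To make that concrete, here is a small sketch (class and method names are hypothetical). An ASCII-only class removes ® and ™, but it also removes spaces, punctuation and accented letters. If letters and digits from other scripts should be kept, a Unicode-aware class such as \p{L} and \p{N} is one possible alternative; that variant is my assumption and not something proposed in this thread. The patterns are compiled once, which avoids recompiling the regex for every row of a large file.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend.
class CharFilters {
    // Matches anything that is not an ASCII letter or digit.
    private static final Pattern NOT_ASCII_ALNUM = Pattern.compile("[^a-zA-Z0-9]");
    // Matches anything that is not a Unicode letter or number.
    private static final Pattern NOT_UNICODE_ALNUM = Pattern.compile("[^\\p{L}\\p{N}]");

    static String keepAsciiAlnum(String s) {
        return NOT_ASCII_ALNUM.matcher(s).replaceAll("");
    }

    static String keepUnicodeAlnum(String s) {
        return NOT_UNICODE_ALNUM.matcher(s).replaceAll("");
    }
}
```

With the input "Café® 2024™", keepAsciiAlnum gives "Caf2024" and keepUnicodeAlnum gives "Café2024"; both drop the ®, the ™ and the space.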
Anonymous
Not applicable

I also found this; it works a treat and retains blank spaces.
row1.myRow.replaceAll("[^a-zA-Z0-9 ]", "");
Thanks