
Handling special characters
Hi Guys,
I need to transform special characters like "á". Whenever I read these characters and create an output file, it shows "". Please help me understand how these characters can be handled so that they are loaded into the output. I am badly stuck on this.
Thanks,
Srinath
Accepted Solutions

I was able to solve the problem by editing the *.bat file in Notepad and adding -Dfile.encoding=utf8 after the word java. It works. Thanks a lot, mate.
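
For context, here is a minimal Java sketch of what that flag touches. It prints the JVM default charset, which is what -Dfile.encoding influences, and then writes and reads a temporary file with UTF-8 named explicitly so the result does not depend on the default. The class and method names (DefaultCharsetCheck, roundTripUtf8) are hypothetical and are not part of any Talend-generated code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class DefaultCharsetCheck {
    // Writes the text to a temp file as UTF-8 and reads it back as UTF-8.
    // Because the charset is named on both sides, the JVM default is irrelevant.
    static String roundTripUtf8(String text) {
        try {
            Path tmp = Files.createTempFile("charset-check", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Shows the charset used by APIs that are not given one explicitly.
        System.out.println("Default charset: " + Charset.defaultCharset());
        String sample = "\u00e1"; // the character "á" from the question
        System.out.println("Round trip intact: " + sample.equals(roundTripUtf8(sample)));
    }
}
```

Running it once as the job is normally launched and once with -Dfile.encoding=utf8 shows whether the default charset differs between the two launches.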

Hi,
Try using regex:
row1.theStringYouWishToTransform.replaceAll("[^\\w]", "")
This replaces every non-word character, that is, any character outside [a-zA-Z_0-9].
If that doesn't match your requirements, you can specify the characters to be replaced yourself:
row1.theStringYouWishToTransform.replaceAll("[àâäéèêëîïôöùûü]", "")
This replaces these characters (àâäéèêëîïôöùûü) with nothing.
You just have to complete the list of characters you want to remove.
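
A small standalone sketch of how the two expressions above behave in plain Java. The class and method names are hypothetical, and the accented characters are written as Unicode escapes so the example does not depend on the encoding of the source file. Note that the first pattern also strips spaces and punctuation, and that the explicit list as written does not include "á".

```java
class RegexStripDemo {
    // Removes every character outside [a-zA-Z_0-9], including spaces and punctuation.
    static String stripNonWord(String s) {
        return s.replaceAll("[^\\w]", "");
    }

    // Removes only the listed characters: à â ä é è ê ë î ï ô ö ù û ü
    static String stripListed(String s) {
        return s.replaceAll(
            "[\u00e0\u00e2\u00e4\u00e9\u00e8\u00ea\u00eb\u00ee\u00ef\u00f4\u00f6\u00f9\u00fb\u00fc]",
            "");
    }

    public static void main(String[] args) {
        String input = "caf\u00e9 au lait!"; // "café au lait!"
        System.out.println(stripNonWord(input)); // prints cafaulait
        System.out.println(stripListed(input));  // prints caf au lait!
    }
}
```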

Thanks for the reply.
But my requirement is not to replace special characters with an empty string. It is to load the special characters into the output file/table at the same size as in the input file, without the data getting trimmed.
In my case, special characters are being populated as empty strings, and I want to know how Talend handles special characters.
Thanks

Hi @srkalakonda,
I encountered a similar issue.
Make sure your source and target files use the same encoding.
If you use UTF-8 character encoding for both, this should not occur.
Cheers!
Gatha
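
To illustrate why the encodings have to match, here is a short Java sketch (hypothetical class and method names) that encodes "á" with one charset and decodes it with another. It only demonstrates the general mismatch effect. It is not a reproduction of the original poster's job.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {
    // Encodes the text with one charset and decodes the bytes with another.
    static String recode(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        String original = "\u00e1"; // "á"

        // Same encoding on both sides: the character survives.
        String same = recode(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(same.equals(original)); // prints true

        // Written as UTF-8 (bytes C3 A1) but read as ISO-8859-1: two wrong characters.
        String mojibake = recode(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(mojibake.equals("\u00c3\u00a1")); // prints true

        // Written as ISO-8859-1 (byte E1) but read as UTF-8: the byte is invalid
        // UTF-8, so the decoder substitutes the replacement character U+FFFD.
        String broken = recode(original, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(broken.contains("\ufffd")); // prints true
    }
}
```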

Hi,
I had a similar issue.
I was sending a message to a SOAP endpoint and Köln was converted to K?ln. This issue didn't occur when running from the Studio, only when I ran the standalone job as a scheduled task.
This is how I fixed it: http://talendhowto.com/2017/09/02/add-encoding-batch-file/
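
A literal "?" in place of a character is what Java produces when a string is encoded with a charset that cannot represent that character. The sketch below shows this with US-ASCII. Whether the scheduled task actually ran with such a default charset is an assumption, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class UnmappableCharDemo {
    // Encodes to US-ASCII and decodes again. Characters that US-ASCII cannot
    // represent are replaced with '?' by String.getBytes(Charset).
    static String throughAscii(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(throughAscii("K\u00f6ln")); // prints K?ln
    }
}
```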

Using UTF-8 as the encoding (code page) will solve the problem. If Latin characters are present, you also have to override the JVM parameters with the UTF-8 parameter.

Hello,
Could you please give details on how you resolved it? I am still getting a ? mark whenever there is a ' mark in my Excel source.
thanks
Ravi

