Anonymous
Not applicable

Handling special characters

Hi Guys,

 

I need to handle special characters like "á". Whenever I read these characters and create an output file, it shows "". Please help me understand how these characters can be handled so that they are loaded into the output. I am badly stuck on this.

 

Thanks,

Srinath

1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

I was able to solve the problem by editing the *.bat file in Notepad and adding -Dfile.encoding=utf8 after the word java. It works. Thanks a lot, mate.


7 Replies
TRF
Champion II

Hi,

Try using regex:

row1.theStringYouWishToTransform.replaceAll("[^\\w]", "")

This removes every non-word character (any character outside [a-zA-Z_0-9]).

 

If that doesn't match your requirements, you can specify the characters to be replaced yourself:

 

row1.theStringYouWishToTransform.replaceAll("[àâäéèêëîïôöùûü]", "")

This replaces these characters (àâäéèêëîïôöùûü) with nothing.

You just have to complete the list with the characters you want to remove.
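To make the difference between the two expressions concrete, here is a minimal sketch in plain Java, outside Talend. The class and method names are made up for illustration, and the accented characters are written as Unicode escapes so the example does not depend on the encoding of the source file. The third variant uses Pattern.UNICODE_CHARACTER_CLASS, which is not part of the suggestion above; it is only shown as one possible way to keep accented letters while still dropping punctuation.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Talend.
final class StripCharsDemo {

    // With this flag, \w also matches Unicode letters such as the o-umlaut (\u00f6).
    private static final Pattern NON_WORD_UNICODE =
            Pattern.compile("[^\\w]", Pattern.UNICODE_CHARACTER_CLASS);

    // Default Java behaviour: \w is ASCII only, so accented letters are removed too.
    static String stripNonWordAscii(String s) {
        return s.replaceAll("[^\\w]", "");
    }

    // Removes only the listed accented characters: the same list as in the post,
    // written as escapes.
    static String stripListed(String s) {
        return s.replaceAll(
                "[\u00e0\u00e2\u00e4\u00e9\u00e8\u00ea\u00eb\u00ee\u00ef\u00f4\u00f6\u00f9\u00fb\u00fc]",
                "");
    }

    // Removes spaces and punctuation but keeps accented letters.
    static String stripNonWordUnicode(String s) {
        return NON_WORD_UNICODE.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "K\u00f6ln 2017!";
        System.out.println(stripNonWordAscii(input));   // Kln2017
        System.out.println(stripListed(input));         // Kln 2017!
        System.out.println(stripNonWordUnicode(input)); // keeps the o-umlaut, drops " " and "!"
    }
}
```

Note that both expressions from the post delete characters, so neither helps if the accented characters must arrive unchanged in the output.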

 

 

Anonymous
Not applicable
Author

Thanks for the reply.

But my requirement is not to replace special characters with an empty string. It is to load the special characters into the output file/table with the same size as the input file, and the data should not get trimmed.

In my case, the special characters are being populated as empty strings, and I want to know how Talend handles special characters.

 

Thanks

Anonymous
Not applicable
Author

Hi @srkalakonda,

 

I encountered a similar issue.

 

Make sure your source and target files use the same encoding.

 

If you use UTF-8 character encoding on both sides, this should not occur.
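As a rough illustration of why the encodings have to match, here is a small plain-Java sketch (not Talend-generated code; the class and method names are hypothetical). It writes a string to a temporary file with one charset and reads it back with another.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: write with one charset, read back with another.
final class EncodingRoundTrip {

    static String roundTrip(String text, Charset writeWith, Charset readWith) {
        try {
            Path tmp = Files.createTempFile("encoding-demo", ".txt");
            try {
                // Encode explicitly instead of relying on the JVM default charset.
                Files.write(tmp, text.getBytes(writeWith));
                // Decode explicitly with the charset under test.
                return new String(Files.readAllBytes(tmp), readWith);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String aAcute = "\u00e1"; // the a-acute from the question

        // Same encoding on both sides: the character survives.
        System.out.println(aAcute.equals(
                roundTrip(aAcute, StandardCharsets.UTF_8, StandardCharsets.UTF_8)));

        // Written as ISO-8859-1, read as UTF-8: the byte is not valid UTF-8,
        // so the decoder substitutes the replacement character U+FFFD.
        System.out.println(
                roundTrip(aAcute, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8)
                        .contains("\uFFFD"));
    }
}
```

In a Talend job the equivalent step would presumably be setting the same encoding on the input and output components, but the sketch above only shows the underlying Java behaviour.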

 

Cheers!

Gatha

rsmits
Contributor

Hi,

 

I had a similar issue. 

 

I was sending a message to a SOAP endpoint and Köln was converted to K?ln. The issue didn't occur when running from the Studio, only when I ran the standalone job as a scheduled task.

 

This is how I fixed it: http://talendhowto.com/2017/09/02/add-encoding-batch-file/
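For anyone wondering where the "?" comes from: one way it can arise is when a string is encoded with a charset that has no mapping for the character, in which case Java substitutes "?". The sketch below reproduces that with US-ASCII. Which charset the scheduled task actually used is not known here, so treat this as an illustration of the mechanism only; the class name is made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: encode a string to bytes and decode it again.
final class QuestionMarkDemo {

    static String encodeAndDecode(String text, Charset cs) {
        // String.getBytes(Charset) replaces unmappable characters with the
        // charset's replacement byte, which is '?' for US-ASCII.
        byte[] bytes = text.getBytes(cs);
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        String city = "K\u00f6ln"; // Koeln with o-umlaut, written as an escape

        // US-ASCII cannot represent the o-umlaut.
        System.out.println(encodeAndDecode(city, StandardCharsets.US_ASCII)); // K?ln

        // UTF-8 can, so the string comes back unchanged.
        System.out.println(city.equals(encodeAndDecode(city, StandardCharsets.UTF_8))); // true
    }
}
```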

 

Anonymous
Not applicable
Author

Using UTF-8 as the encoding will solve the problem, and if Latin characters are present you also have to override the JVM parameters with the UTF-8 parameter.

ravir
Contributor

Hello,

 

Could you please give details on how you resolved it? I am still getting a mark if I have any ' mark in my Excel source.

 

thanks

Ravi

Anonymous
Not applicable
Author

I was able to solve the problem by editing the *.bat file in Notepad and adding -Dfile.encoding=utf8 after the word java. It works. Thanks a lot, mate.
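If you want to check whether the flag is actually picked up, a tiny standalone class like the one below (hypothetical, not generated by Talend) prints the charset the JVM is using by default. It assumes you can run it with the same java command and options as in the *.bat file.

```java
import java.nio.charset.Charset;

// Hypothetical check class: reports the JVM's default encoding settings.
final class DefaultCharsetCheck {

    static String describe() {
        // file.encoding is the system property set by -Dfile.encoding=...;
        // Charset.defaultCharset() is what String.getBytes() and similar calls
        // fall back to when no charset is passed explicitly.
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once as java DefaultCharsetCheck and once as java -Dfile.encoding=UTF-8 DefaultCharsetCheck should show whether the default changes on your machine. On JDK 18 and later the default is already UTF-8, so the two runs may print the same thing there.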