Anonymous
Not applicable

Change encoding in ESB route (UTF-8 to Windows-1252)

Hello everybody,
This is my first message on the forum.
I want to convert a file from UTF-8 to windows-1252 (Cp1252) in a Talend route.
Here is the approach I tested, which seemed the best one to me:
- Start component: cFtp
In the advanced options I set the charset to "utf-8", the encoding of my source file.
- Middle component: cConvertBody
I set the class to: byte[].class, "Cp1252"
- End component: cFtp
I set the charset to "Cp1252", the encoding I want for the output file.
This method doesn't work and I'm getting a little desperate. Do you have any idea that could help me?
Thank you in advance.
PS: I have attached screenshots of my component options.
[Attachments: three screenshots of the component options]
4 Replies
Anonymous
Not applicable
Author

I have not done this before, but was interested in your problem. I don't believe you will be able to do this in the way you are trying (however, I may be wrong). What I would attempt is using a cProcessor component and doing the conversion in Java.
Take a look at this site (http://java67.blogspot.co.uk/2015/05/how-to-convert-byte-array-to-string-in-java-example.html) for an example of how to convert a byte[] to a String with a particular encoding.
However, before you do that, you need to get the data as a byte[].
A byte is a primitive type in Java. It is not a class. Therefore your byte[].class conversion won't work. You need to convert the body to String.class instead. The next component should then be the cProcessor. Once in the cProcessor, you can get hold of your data using code similar to the following:
String myString = exchange.getIn().getBody(String.class);

You can then refer to the post below to convert the String to a byte[] in the cProcessor:
http://stackoverflow.com/questions/18571223/how-to-convert-java-string-into-byte
Then use the post I gave in the first paragraph (http://java67.blogspot.co.uk/2015/05/how-to-convert-byte-array-to-string-in-java-example.html) to convert the encoding.
Then use code very similar to the following to put your newly converted String back into the body:
exchange.getIn().setBody(myConvertedString);

Then the next component *should* have the converted String in the message.
As I said, I have not tried this, but I suspect that this (or a slight variant on this) logic should work for you.
I'd be interested to hear if it does.
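To make the decode-then-encode step concrete, here is a minimal plain-Java sketch of it. It is untested inside a Talend route, and the class and method names (EncodingConverter, utf8ToCp1252) are made up for illustration. It assumes the incoming bytes really are UTF-8. In a cProcessor, one would presumably obtain the input with exchange.getIn().getBody(byte[].class) and hand the result to exchange.getIn().setBody(...); that wiring is an assumption and is not shown here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Talend or Camel.
final class EncodingConverter {

    // "windows-1252" is the canonical charset name; "Cp1252" is a JDK alias for it.
    static final Charset CP1252 = Charset.forName("windows-1252");

    /** Decodes UTF-8 bytes into a String, then re-encodes that String as windows-1252. */
    static byte[] utf8ToCp1252(byte[] utf8Bytes) {
        // Step 1: bytes -> characters, interpreting the input as UTF-8.
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        // Step 2: characters -> bytes in the target encoding.
        // Characters that windows-1252 cannot represent are replaced by the
        // charset's replacement byte ('?'), so the conversion can lose data.
        return text.getBytes(CP1252);
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "café"; the escape avoids depending on the source file encoding.
        byte[] input = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] output = utf8ToCp1252(input);
        System.out.println(input.length + " UTF-8 bytes -> " + output.length + " windows-1252 bytes");
    }
}
```

Returning a byte[] rather than a String is deliberate: a String has no encoding of its own, so if a String is put back into the body, the bytes that end up in the file depend on whatever charset the next component uses when it writes.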
Anonymous
Not applicable
Author

It turns out I may have been wrong in my assertion that you can't do this the way you want, although the way I suggested should still work (it is just the long way around 🙂).
Anonymous
Not applicable
Author

Hello rhall_2.0,
Thank you very much for your answer!
I tried your method:
cFtp -> cConvertBody (String.class) -> cProcessor (see below) -> cFtp
[Screenshot: cProcessor code]
After this route, a file without accented characters (I am French, so accents are used) is reported as "ANSI as UTF-8", but if I add an "é", "è", "à", etc. to the file, it is reported as "ANSI".
The "ANSI as UTF-8" format is (presumably) reported because such characters are encoded identically in UTF-8 and ANSI.
I have doubts about this solution. Do you think this is normal? Thank you again for your help.
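A likely explanation for this behaviour, sketched below in plain Java: ASCII characters are encoded as the same single byte in UTF-8 and windows-1252, so a file containing only ASCII is byte-for-byte identical in both encodings and an editor can only guess which one it is. An accented character such as "é" is two bytes in UTF-8 but one byte in windows-1252, so once one is present the encoding can be told apart. The class and method names here are hypothetical and only serve the demonstration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Talend or Camel.
final class EncodingOverlapDemo {

    static final Charset CP1252 = Charset.forName("windows-1252");

    /** Returns true when the text encodes to identical bytes in UTF-8 and windows-1252. */
    static boolean sameBytesInBothEncodings(String text) {
        return Arrays.equals(text.getBytes(StandardCharsets.UTF_8), text.getBytes(CP1252));
    }

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // ASCII-only text: indistinguishable at the byte level.
        System.out.println(sameBytesInBothEncodings("Bonjour"));          // true
        // "\u00e9t\u00e9" is "été": the accented letters encode differently.
        System.out.println(sameBytesInBothEncodings("\u00e9t\u00e9"));    // false
        System.out.println(hex("\u00e9".getBytes(StandardCharsets.UTF_8))); // C3 A9
        System.out.println(hex("\u00e9".getBytes(CP1252)));                 // E9
    }
}
```

If the output file shows "é" as the single byte E9 in a hex viewer, the conversion to windows-1252 has most likely worked, whatever label the editor displays for an ASCII-only file.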
Anonymous
Not applicable
Author

OK, I think we are nearly there. This is slightly more complicated than I had first thought. Take a look at the accepted answer here (http://stackoverflow.com/questions/28484064/windows-1252-to-utf-8). It seems to make sense. It does the reverse of what you are doing, but it should be easy enough to adapt to what you want.