Hi,
I am trying to do something very simple that seems impossible with Talend.
I read in a UTF-8 file with TAB-delimited data, and I have to write it out delimited by ¬
This simple character ¬ is not getting printed correctly in the output file.
No matter how many different file viewers I use, they all show a SQUARE instead of the character ¬
Both the input and output files of this job use UTF-8. Could you please provide some guidance to resolve this issue?
thanks,
We are using:
Windows 10
>java -version
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
Hi,
This is not a limitation of Talend. I set UTF-8 in both the input and output components and was able to write the data to the output file correctly. Please refer to the screenshot below, which shows the file opened in Notepad++.
Please also see the screenshot of the output file component for reference.
Please make sure that UTF-8 is selected in both the input and output components. If you are still facing the issue, could you please export your job for a quick analysis?
Warm Regards,
Nikhil Thampi
I think our configurations are the same for the output component.
Is there a direct email address I can send this job to? Or can I upload it to the support portal (we have 'platinum support')?
Thanks
Quite interesting 🙂 In the meantime, could you please double-check the Advanced parameters in tFileinput and tFileOutput? I hope the encoding is UTF-8 for both.
If there is any confidential information in your job, my suggestion is to raise a support ticket so that the support team can help you. You can add a link to this community thread in the support case for their quick reference.
I am not part of the support team; I answer community posts as a part-time hobby 🙂 So the case may not come to me. But there are very good people in support, and they can help you fix the issue quickly.
If the answer has helped you, could you please mark the topic as resolved? Kudos are also welcome 🙂
Warm Regards,
Nikhil Thampi
Thanks Nikhil,
The encoding is set to "UTF-8" in both components,
and yet the data is being transformed mid-flow. It is very strange.
Regards,
Hi,
A quick way to check which component is doing the conversion is to add a tLogRow after each component and print the data.
Once you have identified the component, or the logic that is inadvertently changing the data, the problem will be very easy to fix.
Warm Regards,
Nikhil Thampi
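Alongside the tLogRow check, it can help to look at the raw bytes of the delimiter rather than at how a viewer renders it. The following is a minimal Java sketch, not part of the original job; the class name DelimiterBytes and the helper hex are made up for illustration. It shows that ¬ (U+00AC) is two bytes in UTF-8 but a single byte in a one-byte encoding such as ISO-8859-1, and that this single byte is not valid UTF-8 on its own, which is one plausible way for a viewer to end up showing a square or replacement character.

```java
import java.nio.charset.StandardCharsets;

public class DelimiterBytes {

    // Hypothetical helper: renders bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String delimiter = "\u00AC"; // the ¬ character

        // Encoded as UTF-8, the delimiter takes two bytes.
        System.out.println("UTF-8:      " + hex(delimiter.getBytes(StandardCharsets.UTF_8)));

        // Encoded as ISO-8859-1 (a single-byte encoding), it takes one byte.
        System.out.println("ISO-8859-1: " + hex(delimiter.getBytes(StandardCharsets.ISO_8859_1)));

        // That single byte is not a valid UTF-8 sequence by itself, so decoding
        // it as UTF-8 yields the replacement character U+FFFD.
        String decoded = new String(new byte[] {(byte) 0xAC}, StandardCharsets.UTF_8);
        System.out.println("Lone 0xAC decoded as UTF-8 is U+FFFD: " + decoded.equals("\uFFFD"));
    }
}
```

If a hex view of the output file shows AC on its own where the delimiter should be, rather than C2 AC, then some step wrote the character in a single-byte encoding even though the components are set to UTF-8.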
It turns out that somebody had edited the Talend**.ini file.
The following JVM argument was missing from it:
-Dfile.encoding=UTF-8
The support team gave a really good hint in the right direction.
Thanks,
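For anyone hitting the same symptom, here is a small Java sketch for checking what the JVM thinks its default encoding is. It is only an illustration: the class name EncodingCheck and the helper writeDelimiter are invented, and it assumes a Java 8 runtime like the one above, where the default charset follows file.encoding and on Windows is often a single-byte code page unless the argument is set. Whether a particular Talend component falls back to that default is an assumption on my part, not something confirmed in this thread.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class EncodingCheck {

    // Hypothetical helper: writes "a¬b" with an explicit UTF-8 encoding,
    // independent of the JVM default, and returns the bytes read back from disk.
    static byte[] writeDelimiter(Path target) throws IOException {
        Files.write(target, "a\u00ACb".getBytes(StandardCharsets.UTF_8));
        return Files.readAllBytes(target);
    }

    public static void main(String[] args) throws IOException {
        // What the JVM was started with; -Dfile.encoding=UTF-8 changes this value.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        Path tmp = Files.createTempFile("delimiter-check", ".txt");
        try {
            byte[] bytes = writeDelimiter(tmp);
            // With explicit UTF-8 the file holds 4 bytes: 'a', 0xC2, 0xAC, 'b'.
            System.out.println("bytes written   = " + bytes.length);
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Running this with and without -Dfile.encoding=UTF-8 on the command line shows how the reported default changes, while the explicitly encoded file stays the same either way.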