Hi everyone,
I have been struggling to find a solution for this. I have a CSV file in UTF-8 encoding with French letters (ô, â, à, etc.) that are getting garbled in the output. For example:
bientôt replaced as bientÃŽt
été replaced as été
and even ' is replaced by â.
This is not what is expected in the output (DB2). I want these characters to be written as they are.
Can someone please suggest a way to do this?
Thanks in advance
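For readers following along, this kind of garbling is typical of UTF-8 bytes being decoded with a single-byte charset somewhere in the pipeline. A minimal sketch in plain Java, assuming ISO-8859-1 as the wrong charset (the thread does not say which charset was actually involved, and the class and method names are illustrative only):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode the text as UTF-8, then decode those bytes with the wrong charset.
    static String misdecode(String text, Charset wrongCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, wrongCharset);
    }

    public static void main(String[] args) {
        // "\u00e9t\u00e9" is "été", written with escapes so the source file
        // encoding does not matter.
        String garbled = misdecode("\u00e9t\u00e9", StandardCharsets.ISO_8859_1);
        // Each 2-byte "é" becomes two separate characters, so 3 characters turn into 5.
        System.out.println(garbled + " (" + garbled.length() + " chars)");
    }
}
```

If the output of a job looks like this, the bytes are usually intact and only the decoding step is wrong.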
Hi,
I believe VARGRAPHIC will be the right target datatype but please consult your DB2 DBAs for details.
https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/sqlref/src/tpc/db2z_bif_vargraphic.html
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Hi,
After reading the input file, could you please print the data and check whether Talend prints it correctly?
Since you have already selected UTF-8 as the encoding, it should print properly.
The next step is to verify whether the target columns in DB2 can handle non-English letters. Could you please share these details with relevant screenshots for further analysis?
Warm Regards,
Nikhil Thampi
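Outside Talend, the same check can be approximated in plain Java. This is only a sketch: the class name, method name, and the file path passed on the command line are hypothetical. It relies on `Files.readAllLines` throwing an exception when the bytes are not valid for the given charset, so a file that is not really UTF-8 fails loudly instead of being silently garbled.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class CsvEncodingCheck {
    // Read every line as strict UTF-8; malformed byte sequences raise an exception.
    static List<String> readUtf8(Path csv) {
        try {
            return Files.readAllLines(csv, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java CsvEncodingCheck.java <path-to-csv>");
            return;
        }
        for (String line : readUtf8(Path.of(args[0]))) {
            System.out.println(line);
        }
    }
}
```

If the lines print correctly here (on a console that itself uses UTF-8), the file is fine and the problem lies further downstream.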
Hi @nthampi,
Thanks for your reply.
When I print the CSV data in tlogfile, it prints all the characters as required.
But while loading into DB2, with UTF-8 as the encoding in the advanced settings of the tFileInputDelimited component, all characters are fine now except ô.
bientôt is now being written as bientt.
Please let me know what screenshots are needed from my side.
Since you are able to print the data correctly, Talend is reading and passing along the Unicode characters correctly.
But when Talend tries to insert the data into DB2, the current DB2 setup does not seem to allow the characters to be inserted in the right format.
Could you please refer this issue to your DB2 DBAs? They can help with the internationalization settings for the database and tables.
I have added the IBM guide as a starting point.
https://www.ibm.com/support/knowledgecenter/SSEPEK/pdf/db2z_10_charbook.pdf
Warm Regards,
Nikhil Thampi
Thanks @nthampi for those suggestions and your quick response.
Hi @nthampi,
I'm now trying to load the data into VARGRAPHIC columns in DB2 just to see if it would work, but I couldn't figure out how to cast the string columns from the CSV to VARGRAPHIC or GRAPHIC so that they can be loaded into DB2.
Can you please suggest a solution for this?
Thanks !!
Hi,
Could you please try changing the database data type as below in the tDB2Output schema? I could not verify it personally, as I do not have a DB2 database at hand.
Warm Regards,
Nikhil Thampi
Hi @nthampi,
Thanks for getting back.
I solved it by loading the data as strings but increasing the number of bytes I had allotted for each column. I don't understand how it solved my problem, but it did.
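A likely explanation, offered as an assumption since the thread never shows the table definition: DB2 character column lengths are counted in bytes by default, while UTF-8 needs more than one byte for each accented letter. A column sized for the number of characters can therefore be too small for the encoded value. The sketch below (class and method names are illustrative) compares character counts with UTF-8 byte counts for values like those in the question:

```java
import java.nio.charset.StandardCharsets;

class Utf8ByteLength {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Escapes keep the example independent of the source file encoding:
        // "bientôt", "été", and the typographic apostrophe U+2019.
        String[] samples = { "bient\u00f4t", "\u00e9t\u00e9", "\u2019" };
        for (String s : samples) {
            System.out.println(s + ": " + s.length() + " chars, "
                    + utf8Bytes(s) + " bytes in UTF-8");
        }
    }
}
```

Here "bientôt" is 7 characters but 8 bytes, "été" is 3 characters but 5 bytes, and the typographic apostrophe is 1 character but 3 bytes, so a column declared with a length equal to the character count would not have room for them.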