nivedhitha
Creator III

Handling special characters (French letters) in a CSV file so they load into DB2 unchanged

Hi everyone,

 

I have been struggling to find a solution for this. I have a CSV file in UTF-8 encoding containing French letters (ô, â, à, etc.) that are getting replaced by garbled characters:

bientôt is replaced by bientÃŽt

été is replaced by Ã©té

and even ' is replaced by â€

This is not what is expected in the output (DB2). I want these characters to be written as they are.

Can someone please suggest a way to do this?

 

Thanks in advance
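
For readers hitting the same symptom: this pattern usually appears when UTF-8 bytes are decoded with a single-byte code page somewhere along the way. The sketch below reproduces the effect in Python, assuming Windows-1252 is the wrong code page (an assumption; the exact garbled characters depend on which code page misreads the bytes, so they may not match the output above character for character). The `garble` helper is hypothetical and only illustrates the mechanism.

```python
def garble(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as Windows-1252.

    Hypothetical helper: it mimics a reader or driver that assumes the
    wrong code page. Windows-1252 is an assumption, not a diagnosis.
    """
    return text.encode("utf-8").decode("cp1252")


for word in ["été", "bientôt", "\u2019"]:  # "\u2019" is the curly apostrophe
    print(word, "->", garble(word))
# été -> Ã©tÃ©
# bientôt -> bientÃ´t
# ’ -> â€™
```

Each accented letter takes two bytes in UTF-8 and the curly apostrophe takes three, which is why one character turns into two or three wrong ones.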


Accepted Solutions
Anonymous
Not applicable

Hi,

 

   I believe VARGRAPHIC will be the right target datatype but please consult your DB2 DBAs for details.

 

https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/sqlref/src/tpc/db2z_bif_vargraphic.html

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂


8 Replies
Anonymous
Not applicable

Hi,

 

After reading from the input file, could you please print the data and check whether Talend prints it in the correct format?

Since you have already selected UTF-8 as the encoding, it should print properly.

The next step is to verify whether the target columns in DB2 can handle letters other than English ones. Could you please share these details with relevant screenshots for further analysis?
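
One way to run a similar check outside Talend is to test whether a value is already garbled when it is read. The Python sketch below assumes the corruption, if any, came from UTF-8 bytes being read as Windows-1252; `undo_mojibake` is a hypothetical helper name, not a Talend or DB2 function. If the round trip changes the value, it was garbled before it reached the database.

```python
def undo_mojibake(text: str) -> str:
    """Try to reverse 'UTF-8 read as Windows-1252' corruption.

    Returns the repaired text when the round trip succeeds, otherwise the
    original text unchanged. Assumes Windows-1252 was the wrong code page.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The value is not this kind of mojibake; leave it alone.
        return text


for value in ["Ã©tÃ©", "été", "bientôt"]:
    repaired = undo_mojibake(value)
    status = "garbled on read" if repaired != value else "looks intact"
    print(f"{value!r}: {status}")
```

Correctly read values such as "été" fail the round trip and come back unchanged, so the helper can be applied to a whole column without damaging good data.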

 

Warm Regards,
Nikhil Thampi


nivedhitha
Creator III
Author

Hi @nthampi,

 

Thanks for your reply.

When I print the CSV data in tlogfile, all the characters print as required.

But while loading into DB2, with UTF-8 as the encoding in the advanced settings of the tFileInputDelimited component, all characters are fine now except ô.

bientôt is now showing up as bientt.

Please let me know what screenshots you need from my side.

 

 

 

Anonymous
Not applicable

@nivedhitha,

 

Since you are able to print the data correctly, Talend is reading and passing on the Unicode data correctly.

But when Talend tries to insert the data into DB2, the current DB2 setup does not seem to allow the characters to be inserted in the right format.

Could you please refer this issue to your DB2 DBAs? They can help with the internationalization settings for the database and tables.

 

I have added the IBM Guide for the same as a starting point.

 

https://www.ibm.com/support/knowledgecenter/SSEPEK/pdf/db2z_10_charbook.pdf

 

Warm Regards,
Nikhil Thampi


 


nivedhitha
Creator III
Author

Thanks @nthampi for those suggestions and your quick response.

 

nivedhitha
Creator III
Author

Hi @nthampi,

 

I'm now trying to load the data into VARGRAPHIC columns in DB2 just to see if it would work, but I couldn't figure out how to cast the string columns in the CSV to VARGRAPHIC or GRAPHIC so that they can be loaded into DB2.

Can you please provide a solution for this?

 

Thanks !!

Anonymous
Not applicable

Hi,

 

Could you please try changing the database data type in the tDB2Output schema as shown below? I could not verify it myself, as I do not have a DB2 database at hand.

 

[Screenshot: 0683p000009M2D5.png]

 

 

Warm Regards,
Nikhil Thampi


nivedhitha
Creator III
Author

Hi @nthampi,

 

Thanks for getting back.

I solved it by loading it as a string, but increasing the number of bytes I had allotted for each column. I don't understand how it solved my problem, but it did.
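
A likely explanation, assuming the target columns are declared with a length in bytes rather than characters: in UTF-8 every accented letter occupies two bytes, so a value can have fewer characters than the column length and still not fit. The Python sketch below compares character counts with UTF-8 byte counts; `utf8_length` and `fits` are hypothetical helper names, and the 7-byte column is an invented example, not the poster's actual schema.

```python
def utf8_length(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits(text: str, column_bytes: int) -> bool:
    """True if the UTF-8 form of text fits a column of column_bytes bytes."""
    return utf8_length(text) <= column_bytes


for word in ["bientot", "bientôt", "été"]:
    print(f"{word}: {len(word)} characters, {utf8_length(word)} bytes")
# bientot: 7 characters, 7 bytes
# bientôt: 7 characters, 8 bytes
# été: 3 characters, 5 bytes

# A hypothetical 7-byte column holds "bientot" but not "bientôt".
print(fits("bientot", 7), fits("bientôt", 7), fits("bientôt", 8))
# True False True
```

Sizing each column for the longest value measured in UTF-8 bytes, rather than in characters, avoids this kind of overflow.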