
Anonymous
Not applicable
2014-03-03
04:06 AM
File encoding set to ISO-8859-1, but output is detected as UTF-8 or "Non-ISO extended-ASCII text"
Hello fellow Talend'ers,
I have come across a situation where I am gathering data from a DB and writing it into a delimited flat file.
The jobs are as plain as that, nothing fancy at all.
That is, with the exception of the DELIMITER chosen for these files.
The company that is consuming the output has decided that the DELIMITER character should be:
º
This is the masculine ordinal indicator (U+00BA, byte 0xBA in ISO-8859-1):
http://www.ic.unicamp.br/~stolfi/EXPORT/www/ISO-8859-1-Encoding.html
The trouble I am facing is that the files are being interpreted as UTF-8, even though the encoding is defined as ISO-8859-1.
Any suggestions on how this issue may have arisen, and how it could be corrected?
Many thanks,
Nicolas
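
One way to narrow this kind of problem down is to look at the raw bytes around the delimiter rather than trusting what an editor displays. In ISO-8859-1 the character º is the single byte 0xBA, while in UTF-8 it is the two bytes 0xC2 0xBA. The sketch below is only an illustration in Python, not part of any Talend job; the helper name `delimiter_byte_report` and the sample rows are hypothetical.

```python
def delimiter_byte_report(data: bytes) -> dict:
    """Rough count of how the delimiter º appears in raw file bytes."""
    two_byte = data.count(b"\xc2\xba")         # UTF-8 form of º
    one_byte = data.count(b"\xba") - two_byte  # lone 0xBA, the ISO-8859-1 form
    try:
        data.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {"iso_8859_1": one_byte, "utf_8": two_byte, "valid_utf8": valid_utf8}


# Hypothetical sample rows; "\u00ba" is the º delimiter.
rows = "id\u00baname\n1\u00baAna\n"
good = rows.encode("iso-8859-1")  # what the consumer expects
bad = rows.encode("utf-8")        # what a UTF-8 delimiter would look like

print(delimiter_byte_report(good))
print(delimiter_byte_report(bad))
```

This is a heuristic: a 0xBA byte that belongs to some other multi-byte UTF-8 sequence would be counted as a lone byte. In practice you would read the extract with `open(path, "rb").read()` and pass the bytes to the helper.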
571 Views
6 Replies

Contributor III
2014-03-03
09:30 AM
Check whether you have specified the proper encoding in the "Advanced settings" tab of both the input and output components.
If you parse an ISO-8859-1 file as UTF-8, then even if you set the output to ISO-8859-1, the output file will not contain the ISO-8859-1 characters; only question marks will appear instead.
Hope this helps.
Note: Make sure the input and output files are actually ISO-8859-1 encoded as well.
Also make sure you are not overwriting an existing output file. Please try creating a new output file and check again.
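
The question-mark effect described above can be reproduced outside Talend. The following is a minimal Python sketch, assuming the reader substitutes a replacement character for undecodable bytes and the writer substitutes `?` for characters it cannot encode. Talend's generated Java code is assumed, not verified here, to behave in a comparable way.

```python
# Bytes as they sit in an ISO-8859-1 input file: 'a', º (0xBA), 'b'.
iso_bytes = "a\u00bab".encode("iso-8859-1")

# Wrong: read the ISO-8859-1 bytes as UTF-8. A lone 0xBA is not valid UTF-8,
# so it becomes the replacement character U+FFFD.
misread = iso_bytes.decode("utf-8", errors="replace")

# Writing that text back as ISO-8859-1 cannot represent U+FFFD,
# so the encoder substitutes a question mark.
written = misread.encode("iso-8859-1", errors="replace")

# Right: use the same encoding on both sides and the bytes survive unchanged.
round_trip = iso_bytes.decode("iso-8859-1").encode("iso-8859-1")

print(misread.encode("unicode_escape"), written, round_trip)
```

The original º is lost as soon as the input is decoded with the wrong charset, which is why fixing only the output encoding does not bring it back.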

Anonymous
Not applicable
2014-03-06
10:26 AM
Author
Kind of right.
The cause was copying and pasting from a UTF-8 file into an ANSI file.
** Here is the catch **
The character pasted was UTF-8 encoded.
This ANSI file was used as a configuration file, and the UTF-8 character was the value of one of my variables (the delimiter for my extracts).
This corrupted all my extracts.
Where I should have found a single-character delimiter in the extracts, I had 2 characters.
It took me a while to trace this error, but I hope this will help others avoid it in future.
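
For anyone hitting the same thing, the mechanism can be sketched in a few lines of Python. This is an illustration under assumptions, not the actual job: "ANSI" is taken to mean Windows-1252 (`cp1252`), and the field values are made up.

```python
delimiter = "\u00ba"  # º, the intended single-character delimiter

# The pasted value arrives in the config file as the UTF-8 bytes of º.
utf8_bytes = delimiter.encode("utf-8")

# The config file is then read as ANSI (assumed cp1252 here),
# so the two bytes turn into two separate characters.
as_read = utf8_bytes.decode("cp1252")

# The job joins fields with that two-character value and writes ISO-8859-1.
row = as_read.join(["1", "Ana"])
out = row.encode("iso-8859-1")

print(len(as_read), out)

# The written bytes happen to form valid UTF-8, which is consistent with
# tools reporting the extract as UTF-8 although it was written as ISO-8859-1.
print(out.decode("utf-8") == "1\u00baAna")
```

A cheap guard is to check the length of the delimiter value right after loading the configuration and fail the job if it is not exactly one character.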

Contributor III
2014-03-06
10:40 AM
Great...
I think we should have a forum for the following:
nice-to-dos, best practices, things to take care of or avoid, etc.,
so that we can share our experiences there, which would avoid multiple posts being created for the same topic.
A few things I have noticed in the past few days are parsing Excel files and playing around with date formats. I see a lot of posts asking the same questions in different ways.

Contributor III
2014-03-06
11:28 AM
Please mark it as resolved if it worked out.

Anonymous
Not applicable
2014-03-06
11:33 AM
Author
Thanks for your suggestion.
Often best practices are shared via the resolution of an issue, not really in an intentional way.
But this is an interesting suggestion; I'll discuss it with the Talend community team. Stay tuned.

Anonymous
Not applicable
2014-03-30
11:50 PM
Author
Hi all,
We have opened a new forum, Talend Data Integration Best Practices, where you can share your best practices about job design, code reusability, error management, and things to take care of. Feel free to write and share your experiences and stories about Talend development in this forum.
http://www.talendforge.org/forum/viewforum.php?id=40
Best regards
Shong
