Hi holberger,
I have run into this problem as well, and I think it is a bug in Talend: you cannot escape the commas properly.
The only options available in the Escape Char and Text Enclosure drop-down menus are:
- Empty
- "\""
- "'"
- "\\"
If both the Escape Char and the Text Enclosure are set to "'", then to have your fields parsed as:
|Pembroke, MA 02359|Pembroke|Plymouth|MA|
You would need the following input string:
,'Pembroke, MA 02359',Pembroke,Plymouth,MA,
But then, if a field needs an apostrophe in its data, the apostrophe cannot be escaped properly. For example, the following input is not parsed correctly, even if you add escape characters before the apostrophe:
,'Pembroke's Hills Street, MA 02359',Pembroke,Plymouth,MA,
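For comparison, here is a minimal sketch in Python (not Talend) of how a conventional CSV parser treats an apostrophe used as the text enclosure. It relies on Python's standard csv module, and the helper name parse_line is only for illustration. The usual convention is to double the enclosure character inside the field. Talend's parser may well behave differently, so take this only as a reference for the expected result.

```python
import csv
import io

def parse_line(line, enclosure):
    """Parse one CSV line using the given text enclosure character."""
    # doublequote=True: a doubled enclosure inside a field is read as one literal character.
    reader = csv.reader(io.StringIO(line), delimiter=",",
                        quotechar=enclosure, doublequote=True)
    return next(reader)

# Unescaped apostrophe inside an apostrophe-enclosed field:
# the address field does not come out intact.
broken = parse_line(",'Pembroke's Hills Street, MA 02359',Pembroke,Plymouth,MA,", "'")
print(broken)

# Doubled apostrophe, the conventional way to escape the enclosure character.
fixed = parse_line(",'Pembroke''s Hills Street, MA 02359',Pembroke,Plymouth,MA,", "'")
print(fixed)
```

With the doubled apostrophe, the second field comes back as a single value with the comma and the apostrophe preserved, which is what I would expect the wizard to support.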
My conclusion is that the Metadata Wizard cannot be used for this because it is faulty. However, you can define a single tFileInputDelimited and check the "CSV options" tickbox, accepting the defaults for Escape char and Text enclosure. With this configuration your field should be properly parsed from:
,"Pembroke, MA 02359",Pembroke,Plymouth,MA,
to:
|Pembroke, MA 02359|Pembroke|Plymouth|MA|
You would also be able to use apostrophes and escaped double quotes in the data, so that this input:
,"Pembroke's Hills is ""the"" house, MA 02359",Pembroke,Plymouth,MA,
would be parsed as:
|Pembroke's Hills is "the" house, MA 02359|Pembroke|Plymouth|MA|
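If you want to check the expected result outside of Talend, the same input can be run through any parser that follows the usual CSV convention (double quote as text enclosure, doubled to escape it). A minimal sketch with Python's standard csv module, assuming its default settings are close enough to the "CSV options" defaults described above:

```python
import csv
import io

# Same input line as in the example above; the leading and trailing commas
# produce an empty first and last field.
line = ',"Pembroke\'s Hills is ""the"" house, MA 02359",Pembroke,Plymouth,MA,'

# csv.reader defaults: delimiter ',', quotechar '"', doublequote=True
fields = next(csv.reader(io.StringIO(line)))

# Join with pipes to compare against the expected parsed row.
print("|".join(fields))
```

This should print the same pipe-delimited row as shown above, with the comma, the apostrophe and the inner double quotes all kept inside the first non-empty field.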
This workaround works, but it would be better if the same were possible through the Metadata Wizard.
I have tried this with versions 5.2.2 and 5.1.1, so I guess it has not been tackled yet.
Does anyone have any news regarding this?
Many thanks.