I have a CSV file as in the attached image. One of the columns has LF characters in it. The row separator is LF as well. I have tried using tFileInputDelimited with CSV Options checked and the Text Enclosure set to """. tFileInputDelimited seems to ignore the Text Enclosure property.
Any suggestions on how to parse the file?
You can test it in Notepad++:
Press Ctrl+H.
Find what: (?<![",])\n
Replace with: the character you want
Search mode: Regular expression
Then click Replace All and check whether the result is OK.
I think you need to set the escape char to LF rather than ".
You can use a regular expression to replace the LF characters, provided you can tell the LF that separates rows apart from the LF inside a field.
Thank you for the suggestion. But that does not work.
OK, I think you can use a regex to replace every LF that is not preceded by a double quote or a comma. Read the whole file as a single string (with tFileInputRaw, for example), then call
(your string).replaceAll("(?<![\",])\\n","the character you want instead of LF") in a tJavaRow and save the result with tFileOutputRaw.
After that you can read the file with LF as the row separator.
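If you want to check the regex outside a job first, here is a minimal sketch in plain Java. The class name, the method name and the sample data are made up for illustration, and it assumes every row ends with a quoted field. In Talend the same replaceAll call would go in the tJavaRow.

```java
import java.util.regex.Matcher;

public class EmbeddedLfCleaner {
    // LF that is NOT preceded by a double quote or a comma (negative lookbehind)
    private static final String EMBEDDED_LF = "(?<![\",])\\n";

    // Replaces LFs inside fields; quoteReplacement keeps '$' and '\' literal
    static String clean(String raw, String replacement) {
        return raw.replaceAll(EMBEDDED_LF, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // Hypothetical sample: two rows, the first has an LF inside its second field
        String raw = "\"1\",\"first line\nsecond line\",\"a\"\n"
                   + "\"2\",\"plain\",\"b\"\n";
        System.out.print(clean(raw, " "));
    }
}
```

The LF after "first line" is replaced by a space, while the two LFs that follow a closing double quote are kept as row separators.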
Thank you! I was just trying that manually using Notepad++.
I will try this out in Talend.
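It could work if the last field of every row is either between double quotes or empty, because only then is each row-ending LF preceded by a double quote or a comma.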
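To see why the last field matters, here is a small sketch with three made-up samples (the class name and the data are hypothetical). With a quoted or empty last field the row separators survive. With an unquoted, non-empty last field the row-ending LF is replaced too and the rows get merged.

```java
public class LastFieldCaveat {
    // Same regex as in the thread: LF not preceded by a double quote or a comma
    static String clean(String raw) {
        return raw.replaceAll("(?<![\",])\\n", " ");
    }

    public static void main(String[] args) {
        String quotedLast = "\"1\",\"a\nb\",\"x\"\n\"2\",\"c\",\"y\"\n"; // last field quoted
        String emptyLast  = "\"1\",\"a\nb\",\n\"2\",\"c\",\n";           // last field empty
        String bareLast   = "\"1\",\"a\nb\",x\n\"2\",\"c\",y\n";         // last field unquoted

        // Count the LFs left after cleaning: one per row is what we want
        for (String sample : new String[] {quotedLast, emptyLast, bareLast}) {
            long rows = clean(sample).chars().filter(c -> c == '\n').count();
            System.out.println(rows + " row separator(s) left");
        }
    }
}
```

The first two samples keep both row separators. The third keeps none, so the file would have to be fixed another way.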
Thank you, I am trying it out. I will let you know how it goes.
@gjeremy1617088143 thank you. This worked when I tried it with Notepad++. I will try it out in Talend.