Hi,
I am reading an input file whose fields are separated by "~".
One field/column has the structure below:
first_name"~"number
The first name contains values like DOS COOPERATIVA "DE" INDIA
I want the exact name (with its quotes) in the output:
DOS COOPERATIVA "DE" INDIA
I have set Text Enclosure to "\""
and Escape char to """
But the output I get is only
DOS COOPERATIVA, or sometimes the job fails.
What other settings do I need to change?
Hi
Can you show us an example file with more data for testing? Are all fields enclosed in double quotes?
Regards
Shong
If your field delimiter is ~ you do not need to escape the " (double quotes) in your file content.
Could you show us the settings of the tFileInputDelimited?
Hi, below is the screenshot of the settings.
Hi Shong,
The file has the data format below:
"0000495195ININS"~""~""~""~""~""~"GEN_11"~"ACTIVE"~"2021-05-15"~""~""~"PARTY_5"~"DOS COOPERATIVA "DE" INDIA"~"ABC"
Also, the record below is rejected because the parser could not find a ~ after EEE":
"0000495195ININS"~""~""~""~""~""~"GEN_11"~"ACTIVE"~"2021-05-15"~""~""~"PARTY_5"~"DOS COOPERATIVA DE EEE""~"ABC"
OK, now I see the problem. The closing text enclosure is recognised too early. This is actually a bug, because a closing text enclosure should only be valid if it comes directly before a field separator or a line feed.
I would raise a Talend support ticket to push Talend to fix that. In the meantime I would preprocess the file, replace every <"~"> with <@~@>, and read the resulting file. The @ is only a suggestion; you need a character that does not appear in your content.
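For anyone who is able to preprocess the file, the enclosure swap suggested above might look like the sketch below. It is plain Java, independent of Talend. The class name `EnclosureSwap`, the method `swapEnclosure`, and the decision to also swap the quote that opens and closes each line are assumptions added for illustration.

```java
final class EnclosureSwap {

    // Rewrites one line so that the enclosing quotes become `replacement`
    // while quotes inside a field value are left untouched.
    // Assumption: every field is enclosed in double quotes and fields are separated by ~.
    static String swapEnclosure(String line, char replacement) {
        if (line.indexOf(replacement) >= 0) {
            // The new enclosure must not occur in the data, otherwise the problem just moves.
            throw new IllegalArgumentException("Replacement character already occurs in the line");
        }
        String r = String.valueOf(replacement);
        // Swap the closing-quote / separator / opening-quote sequence between two fields.
        String out = line.replace("\"~\"", r + "~" + r);
        // Swap the quote that opens the first field.
        if (out.startsWith("\"")) {
            out = r + out.substring(1);
        }
        // Swap the quote that closes the last field.
        if (out.endsWith("\"")) {
            out = out.substring(0, out.length() - 1) + r;
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "\"PARTY_5\"~\"DOS COOPERATIVA \"DE\" INDIA\"~\"ABC\"";
        System.out.println(swapEnclosure(line, '@'));
    }
}
```

The rewritten file could then be read with the text enclosure set to the replacement character, provided that character really never appears in the data.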
Thanks @Jan Lolling
I cannot preprocess the file.
I need to find an alternative solution.
OK, in this case you need to read the whole line (use tFileInputDelimited and set the field delimiter to "") with only one column (here: line). In this example I use the pipe as the new field delimiter.
Then process the line in a tJavaRow with
output_row.line = input_row.line.replace("\"~\"", "|");
The next component is a tExtractDelimitedFields: set its field delimiter to "|" and set up its output schema according to the schema actually expected for this file.
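To see what the replace-and-split approach does to a record, here is a minimal standalone sketch in plain Java. It splits directly on the three-character sequence "~" instead of going through a pipe, so it does not depend on the pipe being absent from the data. It also strips the quote that opens the first field and the one that closes the last field, which a plain replace of "~" leaves in place. The class and method names are hypothetical, and the sketch assumes every field is enclosed in double quotes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class QuotedTildeSplit {

    // Splits one raw line whose fields are enclosed in double quotes and separated by ~.
    static List<String> splitLine(String line) {
        String body = line;
        // Drop the quote that opens the first field and the one that closes the last field.
        if (body.length() >= 2 && body.startsWith("\"") && body.endsWith("\"")) {
            body = body.substring(1, body.length() - 1);
        }
        // Split on the literal sequence "~" ; the limit -1 keeps trailing empty fields.
        return Arrays.asList(body.split(Pattern.quote("\"~\""), -1));
    }

    public static void main(String[] args) {
        String line = "\"PARTY_5\"~\"\"~\"DOS COOPERATIVA \"DE\" INDIA\"~\"DOS COOPERATIVA DE EEE\"\"~\"ABC\"";
        for (String field : splitLine(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

Quotes inside a value, including one directly before the closing enclosure as in EEE", stay part of the field. A value that itself contains the sequence "~" would still be split in the wrong place.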
Hi @Jan Lolling
I found an alternative:
1) Change Text Enclosure to "".
Each string then keeps its surrounding quotes, e.g. "DOS COOPERATIVA DE EEE""
2) Use the substring function
row1.first_name.substring(1, row1.first_name.length()-1)
to remove the quotes.
It works, but it is a very long process as I have 200 columns in my file.
Talend should provide some alternative solution/functionality to deal with this.
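The substring call above throws a NullPointerException for a null value and a StringIndexOutOfBoundsException for a value shorter than two characters. A small guarded helper avoids both and keeps the expression short when it has to be repeated for many columns. This is only a sketch: `QuoteUtil`, `stripOuterQuotes` and `stripAll` are hypothetical names, not Talend built-ins, and placing the class in a Talend user routine is an assumption about how it might be reused.

```java
final class QuoteUtil {

    // Removes one leading and one trailing double quote when both are present.
    // Null, short, and unquoted values are returned unchanged.
    static String stripOuterQuotes(String value) {
        if (value == null || value.length() < 2) {
            return value;
        }
        boolean quoted = value.charAt(0) == '"' && value.charAt(value.length() - 1) == '"';
        return quoted ? value.substring(1, value.length() - 1) : value;
    }

    // Applies stripOuterQuotes to every element, e.g. to fields obtained by splitting a raw line.
    static String[] stripAll(String[] fields) {
        String[] result = new String[fields.length];
        for (int i = 0; i < fields.length; i++) {
            result[i] = stripOuterQuotes(fields[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(stripOuterQuotes("\"DOS COOPERATIVA DE EEE\"\""));
    }
}
```

In a column expression this would read QuoteUtil.stripOuterQuotes(row1.first_name). It still has to be written once per column, so it shortens the work rather than removing it.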