Hi All,
I receive data in the following format in a .txt file:
"a","b","c","d","e","f,g,h,i,j,k","l","m","n","o","p","q","r","s"
We want "f,g,h,i,j,k" to be passed to the next component as a single column. When we use "," as the separator, it produces many lines. The quoted value "f,g,h,i,j,k" contains an arbitrary number of commas. How can we pass "f,g,h,i,j,k" to the next components as one column?
Your response would be appreciated.
Hello, are you using the Talend Data Streams AMI? I gave this a quick check and it returned one record with 14 fields for your given example.
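For comparison, here is a quick way to check the expected result outside of Talend. This is only a sketch using Python's standard csv module, not what Data Streams runs internally; it just shows how a quote-aware parser treats the sample line versus a plain split on commas.

```python
import csv
import io

# The sample record from the question.
line = '"a","b","c","d","e","f,g,h,i,j,k","l","m","n","o","p","q","r","s"'

# A naive split treats every comma as a field delimiter, quoted or not.
naive = line.split(",")

# csv.reader honours the double-quote enclosure, so commas inside
# quotes stay part of the field.
fields = next(csv.reader(io.StringIO(line), delimiter=",", quotechar='"'))

print(len(naive))   # 19
print(len(fields))  # 14
print(fields[5])    # f,g,h,i,j,k
```

If your job produces something closer to the naive result, the quote enclosure is probably not being applied.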
To be clear, CSV processing is not yet well documented. The format should follow https://tools.ietf.org/html/rfc4180 except that record delimiters are not permitted inside quotes. In your case, is the comma a field delimiter or a record delimiter? This matters: most big data text files forbid record delimiters inside fields (even when quoted), since that makes the file unsplittable across nodes.
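To illustrate the splittability point, here is a small sketch in Python with made-up data (the two-record sample is hypothetical, not from your file). A reader that cuts the input at newlines without tracking quotes sees three pieces, while a quote-aware reader that scans from the start sees two records.

```python
import csv
import io

# Hypothetical data: the first record has a newline inside a quoted field.
data = '"a","line one\nline two","c"\n"d","e","f"\n'

# Cutting at every newline, as a node would when it starts reading in the
# middle of a file, breaks the first record in two.
chunks = data.splitlines()

# A quote-aware reader has to scan from the beginning to know whether a
# newline is inside quotes, which is why such a file cannot be split.
records = list(csv.reader(io.StringIO(data)))

print(len(chunks))   # 3
print(len(records))  # 2
```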
We have work in progress to add configurable quote enclosures. Does your use case require record delimiters inside quotes? If so, would it be acceptable if each file was unsplittable?
@rskraba, could you please provide me the code/job here?