Hi,
I have a file that contains multiple delimiters, and I want to separate the columns on that basis: the first delimiter is a fixed width of 7 bytes, the second is ";DNIS=", the third is ";ANI=", the fourth is ";MACHINEID=", and so on. In Ab Initio I can differentiate each column on the basis of a different delimiter, but here I am stuck. Can someone tell me how I can read this file in Talend?
Thanks in advance.
Hi,
Could you please provide 3 or 4 sample input records and your expected output for these records? Then we can build a solution around it.
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Thanks, Nikhil, for looking into this. We did it either by using substring or by replacing all the delimiters with the same delimiter.
The records look like 1234567manish;DNIS=mishra;ANI=xyz;MACHINEID=abc;CHANNELID=def and the output was first_column=1234567, second_column=manish, third_column=mishra, fourth_column=xyz, fifth_column=abc, sixth_column=def
Let me know if you have any other solution.
Using a single delimiter is the right approach in this case, as it reduces confusion in the layout.
Warm Regards,
Nikhil Thampi
I thought of that too, but the first delimiter is a fixed 7 bytes, so I have to split the record into two parts and then combine them. Is there any way to insert the same delimiter after the first 7 bytes within the same component?
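One way to do both steps in a single place is a small Java helper, sketched below. This is only an illustration, not an official Talend function: the class name RecordNormalizer and the method normalize are hypothetical, and it assumes that every label has the form ";UPPERCASE=" and that field values never contain such a sequence themselves.

```java
class RecordNormalizer {

    // Hypothetical helper: rewrites a record so that every field is separated by ';'.
    static String normalize(String record) {
        // Records shorter than the 7-character fixed-width field are returned unchanged.
        if (record == null || record.length() < 7) {
            return record;
        }
        // Step 1: insert ';' after the first 7 characters.
        String withFirst = record.substring(0, 7) + ";" + record.substring(7);
        // Step 2: collapse each ";LABEL=" (for example ";DNIS=") into a plain ';'.
        return withFirst.replaceAll(";[A-Z]+=", ";");
    }

    public static void main(String[] args) {
        System.out.println(normalize(
            "1234567manish;DNIS=mishra;ANI=xyz;MACHINEID=abc;CHANNELID=def"));
    }
}
```

For the sample record this produces 1234567;manish;mishra;xyz;abc;def, which could then be split on ';' by a standard delimited component such as tExtractDelimitedFields. To use it as a Talend routine, the class would need to be declared public in the routines package.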
Hi,
Do the column lengths change from record to record? If they do not change, you can use the LEFT and RIGHT functions and substrings:
StringHandling.LEFT("1234567manish",7)
StringHandling.RIGHT("DNIS=mishra",6)
StringHandling.RIGHT("ANI=xyz",3)
StringHandling.RIGHT("MACHINEID=abc",3)
StringHandling.RIGHT("CHANNELID=def",3)
I am not entirely clear on the question, but I hope this gives you an idea.
While we can do all sorts of patchwork after receiving the message in a jumbled format, the ideal solution is to push the source to provide the data in a simple, readable format. It can be any of the established file formats, such as CSV, JSON, XML or Avro.
Why is the source system currently sending the data in this unorthodox format? Is there a specific reason for it?
Warm Regards,
Nikhil Thampi
Ok. In that case, you will have to write custom Java string-parsing functions inside a tJava or tJavaRow component, or in a routine, to handle your specific scenario.
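As a rough sketch of what such a parsing function might look like, the example below uses a single regular expression with one capture group per column. The class name MultiDelimiterParser and the method parse are hypothetical, and the pattern covers only the labels shown in the sample record earlier in this thread (DNIS, ANI, MACHINEID, CHANNELID); any further labels in the real file would have to be added to the pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultiDelimiterParser {

    // 7-character fixed-width field, then one group per labelled section.
    // Assumption: the labels always appear in this order and exactly once.
    private static final Pattern RECORD = Pattern.compile(
        "^(.{7})(.*?);DNIS=(.*?);ANI=(.*?);MACHINEID=(.*?);CHANNELID=(.*)$");

    // Returns the six column values, or null if the record does not match the layout.
    static String[] parse(String record) {
        if (record == null) {
            return null;
        }
        Matcher m = RECORD.matcher(record);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[m.groupCount()];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1); // capture groups are numbered from 1
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] cols = parse(
            "1234567manish;DNIS=mishra;ANI=xyz;MACHINEID=abc;CHANNELID=def");
        System.out.println(String.join(",", cols));
    }
}
```

Inside a tJavaRow, each element of the returned array could be assigned to the matching output column. Returning null for non-matching records is just one possible design choice; it lets the job route malformed lines to a reject flow instead of failing.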
Warm Regards,
Nikhil Thampi