kavita02
Contributor

Reading specific columns from delimited file

I am reading a 15 GB input file in Talend that has 200 pipe-delimited ("|") fields (columns), of which I need only 5 arbitrary fields.

To get these 5 fields, I read the whole 15 GB file with all 200 columns using the tFileInputDelimited component, then drop the 195 unwanted columns using the tFilterColumns component. This is time-consuming: it takes approximately 4 to 5 minutes to read the whole 15 GB file.

Can anyone suggest an alternative way to implement this?

More specifically, is there any way to read only specific fields from a delimited file?

3 Replies
akumar2301
Specialist II

Try tFileInputRegex:

https://help.talend.com/reader/KxVIhxtXBBFymmkkWJ~O4Q/CxhF82OjiQKwpRJriKB6~g
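
For illustration, here is a rough sketch in plain Java of the kind of pattern this suggestion points at: a regex that skips unwanted pipe-delimited fields and captures only the wanted ones. The class and method names are hypothetical, the component's exact configuration is not shown here, and whether a regex is actually faster than a delimited read on a 15 GB file is untested. It also assumes fields never contain a quoted or escaped "|".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend.
class RegexColumnSketch {
    // Builds a pattern that captures only the fields at the given
    // sorted, 0-based column indices of a "|"-delimited line.
    static Pattern build(int[] sortedIndices) {
        StringBuilder sb = new StringBuilder("^");
        int prev = -1;
        for (int idx : sortedIndices) {
            int skip = idx - prev - 1; // unwanted fields between two captures
            if (prev >= 0) {
                sb.append("\\|"); // delimiter right after the previous capture
            }
            if (skip > 0) {
                sb.append("(?:[^|]*\\|){").append(skip).append("}");
            }
            sb.append("([^|]*)"); // the wanted field
            prev = idx;
        }
        sb.append("(?:\\|.*)?$"); // ignore whatever follows the last wanted field
        return Pattern.compile(sb.toString());
    }

    public static void main(String[] args) {
        Matcher m = build(new int[] {1, 3}).matcher("a|b|c|d|e");
        if (m.matches()) {
            System.out.println(m.group(1) + "," + m.group(2));
        }
    }
}
```

Because each skipped field is matched with `[^|]*`, the pattern cannot backtrack across delimiters, but every character of the row up to the last wanted field is still scanned.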

Try reading the data in stream mode; this could improve performance.

Also enable parallelization (if you have the subscription version):

https://help.talend.com/reader/TKUQ4WRBbYZRnl9OyAgr5w/cSnwqkJCdsct_heLy3lrAQ

vapukov
Master II

I'm afraid you can hardly improve the time in this case.

A delimited format means the file is read row by row; even if you need only a few columns, you must still read each whole row.

With an average disk (not NVMe), simply reading a 15 GB file will take 2+ minutes, plus some time to parse and filter.
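
To make the point concrete, here is a minimal sketch in plain Java (not Talend-generated code; `ColumnPicker` and its methods are hypothetical names) of what "reading only some columns" can mean at best: every row is still read in full, but strings are created only for the wanted fields. It assumes the wanted indices are sorted and 0-based, and that fields never contain a quoted or escaped delimiter.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.function.Consumer;

// Hypothetical helper, not part of Talend.
class ColumnPicker {
    // Returns only the fields at the given sorted, 0-based indices of one line.
    // Fields missing from a short line are left as null.
    static String[] pick(String line, char delim, int[] sortedIndices) {
        String[] out = new String[sortedIndices.length];
        int field = 0; // index of the field starting at 'start'
        int start = 0;
        int k = 0;     // position of the next wanted index
        while (k < sortedIndices.length && start <= line.length()) {
            int end = line.indexOf(delim, start);
            if (end < 0) {
                end = line.length();
            }
            if (field == sortedIndices[k]) {
                out[k++] = line.substring(start, end); // allocate wanted fields only
            }
            field++;
            start = end + 1;
        }
        return out;
    }

    // Streams the input row by row and hands each reduced row to the consumer,
    // so the whole file is never held in memory.
    static void forEachRow(BufferedReader reader, char delim, int[] sortedIndices,
                           Consumer<String[]> consumer) throws IOException {
        String line;
        while ((line = reader.readLine()) != null) {
            consumer.accept(pick(line, delim, sortedIndices));
        }
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new StringReader("a|b|c|d|e\n1|2|3|4|5"));
        forEachRow(in, '|', new int[] {1, 3}, row -> System.out.println(String.join(",", row)));
    }
}
```

The scan stops as soon as the last wanted field is found, which saves parsing work when the wanted columns sit near the start of the row. The disk read itself is unchanged, so the 2+ minute floor mentioned above still applies.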

kavita02
Contributor
Author

Just to confirm: as I understand it, when we read a delimited file using tFileInputDelimited, it reads the data row by row and creates an object for each field according to its type. Correct me if I am wrong.