I'm reading a CSV with multiple schemas (header, record) through tFileInputMSPositional, with the header rows appearing first in the file. When mapping the record rows I need access to the header rows. I tried to accomplish this with tHashOutput/tHashInput and a tMap lookup (see image). However, tMap tries to load the lookup at launch rather than on first use, which causes a "hash not initialized" error. How can I achieve this without re-reading the input file?
Another question while I'm at it: can you use repository schemas in tFileInputMSPositional/Delimited? I find it odd that I can use a repository schema for the non-MS versions but not for the MS ones.
Hi,
Since you are using tHashOutput in the same subjob as the tMap that reads it, you will get an error. Either load the header rows into the hash in a previous subjob, or read the file again in the lookup section.
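To illustrate why the order matters, here is a minimal sketch in plain Java (not Talend-generated code). It assumes a hypothetical positional layout that is not from the original post: character 0 is the row type (`H` or `R`), characters 1 to 3 are a key, and the rest is the payload. The class and method names are made up for the example. The point is only that the lookup is fully built in a first step before the main flow starts using it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LookupThenMain {
    // Step 1 (plays the role of the earlier subjob): collect header rows into a lookup.
    // Assumed layout: 'H' + 3-character key + header text.
    static Map<String, String> loadHeaders(List<String> rows) {
        Map<String, String> headers = new HashMap<>();
        for (String row : rows) {
            if (row.length() >= 4 && row.charAt(0) == 'H') {
                headers.put(row.substring(1, 4), row.substring(4).trim());
            }
        }
        return headers;
    }

    // Step 2 (plays the role of the main flow): the lookup is already complete here.
    // Assumed layout: 'R' + 3-character key + record text.
    static List<String> mapRecords(List<String> rows, Map<String, String> headers) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            if (row.length() >= 4 && row.charAt(0) == 'R') {
                String key = row.substring(1, 4);
                // Records without a matching header get an empty header field.
                String header = headers.getOrDefault(key, "");
                out.add(row.substring(4).trim() + "|" + header);
            }
        }
        return out;
    }
}
```

Calling `loadHeaders` first and passing its result to `mapRecords` mirrors the two-subjob arrangement: the second step never runs against an uninitialized lookup.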
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Thanks for the reply. If you were given free rein over the solution design, is there a proper way to read the header and the records from a file without reading the file more than once? Or is there some way to start streaming a file and then stop once the header lines have been read? The reason for not wanting to re-read the files is that they are in the 2+ GB range.
Hi,
You will have to read the file twice, since you need both the main and the lookup details from the same file. But if you want to limit the data read for the header lookup, that is possible with the limit feature.
If the limit is -1, the full file is read. If it is a specific number of records, say 10, only 10 lines are read.
You can use this to avoid reading the full data twice.
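As a rough illustration of what a row limit buys you, the sketch below reads from a stream and stops as soon as the requested number of lines has been collected, so the rest of a large file is never scanned. It is plain Java, not the code Talend generates; the class and method names are hypothetical, and the convention that a negative limit means "read everything" is simply borrowed from the description above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class HeaderPeek {
    /**
     * Reads at most `limit` lines from `source` and then stops.
     * A negative limit (for example -1) reads every line.
     */
    static List<String> readFirstLines(Reader source, int limit) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // The limit is checked before each read, so no line beyond it is requested.
            while ((limit < 0 || lines.size() < limit)
                    && (line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

With a `FileReader` on the input file and a limit equal to the number of header rows, the lookup pass touches only the start of the file, while the main pass still reads it in full.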
I hope I have answered your query. Please spare a second to mark the topic as answered as it will help other Talend community members during their reference.
Warm Regards,
Nikhil Thampi