Hi Forum,
I have a Job with the following components:
tFileInputDelimited joins to a tReplace component, which joins to a tUniqRow component, and this all connects to a tMap component.
Everything works OK, and I have another use for the data produced by the tUniqRow component.
I want to avoid starting all over again from the tFileInputDelimited component, so I have been trying various other components.
Some components, like tCreateTable, won't connect to a tUniqRow or tReplace, so I can't create a second output that way.
I wonder if there's any component that will hold the results of the tUniqRow component within the Job.
I want to avoid saving the data to any external location; I need it to stay part of the Job.
Hopefully someone has had this issue before.
Thanks
Hi,
Why don't you store the data in interim hash memory (tHashOutput/tHashInput)?
You can then reuse the result set multiple times.
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
You can use tBufferOutput to hold row results, and tBufferInput to retrieve what was placed into the buffer.
Hi,
If you want to store the data in memory temporarily, please try the tBufferOutput and tBufferInput components.
https://help.talend.com/reader/wDRBNUuxk629sNcI0dNYaA/ajog62SaP0aL7VH9l9vpog
https://help.talend.com/reader/wDRBNUuxk629sNcI0dNYaA/yh_GCK7WMATxbOSlI_Ck1Q
Please note that you will need enough memory to hold the data, so please do not use a big data set: it will eat your memory pretty quickly.
Warm Regards,
Nikhil Thampi
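Conceptually, the tBufferOutput/tBufferInput (and tHashOutput/tHashInput) pattern boils down to writing rows into an in-memory store once, then re-reading them from any later subjob without touching the source file again. A minimal sketch of that idea in plain Java (the class and method names here are illustrative only, not Talend-generated code):

```java
import java.util.ArrayList;
import java.util.List;

public class BufferSketch {
    // In-memory buffer standing in for tBufferOutput/tHashOutput:
    // rows written once can be re-read by any later subjob.
    static final List<String[]> buffer = new ArrayList<>();

    // "tBufferOutput": append a row to the shared buffer.
    static void bufferOutput(String[] row) {
        buffer.add(row);
    }

    // "tBufferInput": read the buffered rows; reading does not consume them.
    static List<String[]> bufferInput() {
        return buffer;
    }

    public static void main(String[] args) {
        // First subjob: write the deduplicated rows into the buffer.
        bufferOutput(new String[] {"1", "alice"});
        bufferOutput(new String[] {"2", "bob"});

        // Later subjobs: read the same rows as many times as needed.
        System.out.println("first pass:  " + bufferInput().size() + " rows");
        System.out.println("second pass: " + bufferInput().size() + " rows");
    }
}
```

The trade-off the reply warns about is visible here: the whole dataset lives on the JVM heap for the life of the Job.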
Thanks, I've noted the other reply; I will need to use large datasets with this tBufferInput/tBufferOutput option.
Is there another option that would allow me to split the data into more than one stream to use for another purpose?
Thanks for your help.
Thanks for your advice. I do need to use large datasets, so maybe this isn't suitable.
Is there another option to duplicate the dataset, or split it into two identical streams?
Thanks
Hi,
Another method is to use the tReplicate component to replicate identical datasets for further processing.
Warm Regards,
Nikhil Thampi
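tReplicate simply pushes every incoming row, unchanged, to each of its connected output flows. A minimal Java sketch of that fan-out behaviour (again an illustrative analogy, not Talend code; the names are made up):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ReplicateSketch {
    // Stand-in for tReplicate: every incoming row is forwarded
    // to each registered downstream flow unchanged.
    static final List<Consumer<String>> outputs = new ArrayList<>();

    // Connect one more downstream flow.
    static void connect(Consumer<String> downstream) {
        outputs.add(downstream);
    }

    // Push one row to all connected flows.
    static void push(String row) {
        for (Consumer<String> out : outputs) {
            out.accept(row);
        }
    }

    public static void main(String[] args) {
        List<String> flowA = new ArrayList<>();
        List<String> flowB = new ArrayList<>();
        connect(flowA::add);
        connect(flowB::add);

        // One upstream row feeds both identical streams.
        push("id=1;name=alice");
        System.out.println(flowA.equals(flowB)); // both flows hold the same rows
    }
}
```

Unlike the buffer approach, nothing is held in memory beyond the row currently in flight, which is why tReplicate scales better for large datasets.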
Thanks Nikhil,
I couldn't get tReplicate to connect to the tReplace component, but I can use tHashInput & tHashOutput.
Are tHashInput & tHashOutput the most efficient choice for big data, i.e. hundreds of millions of records?
Thanks
Hi,
If it's a big-data type of flow, I would recommend storing the interim data set in files rather than in hash memory.
Warm Regards,
Nikhil Thampi
Thanks Nikhil,
I checked the actual counts and they're about 10 million records.
Is this OK for hash memory to handle?
Thanks
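A back-of-envelope heap estimate can answer this for a specific schema. The sketch below assumes an average of 200 bytes per row in memory; that figure is an assumption, so measure your own schema (string lengths, column count, JVM object overhead) before relying on it:

```java
public class HashMemoryEstimate {
    public static void main(String[] args) {
        // Rough heap estimate for holding all rows in tHashOutput.
        // bytesPerRow is an ASSUMED average, including JVM object
        // overhead; adjust it to match your actual row schema.
        long rows = 10_000_000L;
        long bytesPerRow = 200L;

        double gb = rows * bytesPerRow / (1024.0 * 1024 * 1024);
        System.out.printf("~%.1f GB of heap needed%n", gb); // ~1.9 GB here
    }
}
```

If the estimate fits comfortably inside the Job's JVM heap (the `-Xmx` setting in the Run tab), 10 million rows in hash memory is plausible; if not, the file-based interim storage suggested above is the safer route.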