Anonymous
Not applicable

create staging table in Talend

Hi Forum,

I have a Job with the following components:

A tFileInputDelimited connects to a tReplace component, which feeds a tUniqRow component, which in turn connects to a tMap component: tFileInputDelimited --> tReplace --> tUniqRow --> tMap.

It all works OK, and I have another use for the data produced by the tUniqRow component.

I want to avoid starting all over again from the tFileInputDelimited component, so I have been trying various other components to achieve this.

Some components, like tCreateTable, won't connect to a tUniqRow or tReplace, so I can't use them to create another output.

I wonder if there is any component that will hold the results of the tUniqRow component within the Job.

I want to avoid saving the data to any external location; I need it to stay part of the Job.

Hopefully someone has had this issue before.

Thanks   

 

Labels (1)
  • v7.x

1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

Hi,

 

    Why don't you store the data in interim hash memory (tHashOutput/tHashInput), as shown below?

[Screenshot: job design storing the interim data in hash memory and reading it back]

 

You can then use the result set multiple times.
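
A rough sketch of how the job could be laid out (component names are from this thread; the OnSubjobOk triggers and the component settings are my assumptions):

    Subjob 1: tFileInputDelimited --> tReplace --> tUniqRow --> tHashOutput
                  |
                  OnSubjobOk
                  v
    Subjob 2: tHashInput --> tMap --> ...
    Subjob 3: tHashInput --> [your second use of the data]

In each tHashInput, tick "Link with a tHashOutput" and select the tHashOutput above, so that both reads pull from the same in-memory store.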

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂


10 Replies
Anonymous
Not applicable
Author

You can use tBufferOutput to hold row results, and tBufferInput to retrieve what was placed into the buffer.
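
For example (a minimal sketch, assuming the read happens in a second subjob triggered with OnSubjobOk):

    Subjob 1: tFileInputDelimited --> tReplace --> tUniqRow --> tBufferOutput
                  |
                  OnSubjobOk
                  v
    Subjob 2: tBufferInput --> tMap --> ...

Note that tBufferInput does not pick up the schema automatically; give it the same schema you defined on tBufferOutput.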

Anonymous
Not applicable
Author

Hi,

 

    If you want to store the data in memory on a temporary basis, please try the tBufferOutput and tBufferInput components.

 

https://help.talend.com/reader/wDRBNUuxk629sNcI0dNYaA/ajog62SaP0aL7VH9l9vpog

 

https://help.talend.com/reader/wDRBNUuxk629sNcI0dNYaA/yh_GCK7WMATxbOSlI_Ck1Q

 

    Please note that you will need enough memory to hold the data, so please do not use this approach with a big data set, as it will eat your memory pretty quickly.
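
If the data set is only moderately large, one thing you could try (a suggestion, not something the components require) is raising the Job's JVM heap in the Run view under Advanced settings, for example:

    -Xms1024m -Xmx4096m

Even then, in-memory components will not scale to arbitrarily large files.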

 

Warm Regards,
Nikhil Thampi


Anonymous
Not applicable
Author

Thanks. I've noted the other reply as well; however, I will need to use large datasets, so the tBufferInput/tBufferOutput option may not be suitable.

Are there other options that would allow me to split the data into more than one stream for another purpose?

Thanks for your help. 

 

Anonymous
Not applicable
Author

Thanks for your advice. I do need to use large datasets, so maybe this isn't suitable.

Is there another option to duplicate the dataset or split it into 2 identical streams?

Thanks


Anonymous
Not applicable
Author

Hi,

 

    Another method is to use the tReplicate component to replicate identical datasets for further processing.
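
Something like this (a sketch; the second branch is just a placeholder for your other use of the data):

    tFileInputDelimited --> tReplace --> tUniqRow --> tReplicate --> (Main 1) tMap --> ...
                                                                 --> (Main 2) [other processing]

tReplicate copies every incoming row to each of its output flows, so both branches receive the identical dataset without anything being written to disk or held in a separate buffer.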

 

Warm Regards,
Nikhil Thampi


Anonymous
Not applicable
Author

Thanks Nikhil,

I couldn't get tReplicate to connect to the tReplace component, but I can use tHashOutput & tHashInput.

Are tHashInput & tHashOutput the most efficient choice for big data, i.e. hundreds of millions of records?

Thanks

Anonymous
Not applicable
Author

Hi,

 

     If it's a big-data type of flow, I would recommend storing the interim data set in files rather than in hash memory.
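
For example (a sketch; the staging-file path is hypothetical):

    Subjob 1: tFileInputDelimited --> tReplace --> tUniqRow --> tFileOutputDelimited ("/tmp/staging.csv")
                  |
                  OnSubjobOk
                  v
    Subjob 2: tFileInputDelimited ("/tmp/staging.csv") --> tMap --> ...

The staging file can then be re-read as many times as needed without keeping the whole dataset in the JVM heap.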

 

Warm Regards,
Nikhil Thampi


Anonymous
Not applicable
Author

Thanks Nikhil,

I checked the actual counts and they're about 10 million records.

Is this OK for hash memory to handle?

Thanks