Anonymous
Not applicable

tMap reload at each row (cache)

Hello,

 

I have a question.

I'm using a tFileInputMSDelimited and two tMaps in my job (for now). In the tMap lookup settings, I have a get_dates component that retrieves dates from the database. For better performance, the lookup is set to "Reload at each row (cache)", so that the select statement is not run against the database more than once for the same date. Since I'm using the same get_dates component in two different tMaps, will the dates cached by the first tMap also be available to the second? Or is a separate cache created for each tMap?

0683p000009M9aF.png

Thank you!

6 Replies
TRF
Creator III

I'm afraid the caches are separate for each tMap.

Maybe you should query the dates once, store the result in a tHashOutput, and replace the existing tDBInput for this table with a tHashInput.
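
To illustrate the idea of loading once and sharing the result, here is a minimal plain-Java sketch. It is not Talend-generated code and does not use any Talend API: the class name `SharedLookupCache`, the `loader` function (a stand-in for the get_dates query), and the sample dates are all hypothetical. It only shows that when two consumers read from one shared in-memory map, each distinct key is fetched a single time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class SharedLookupCache {
    // Hypothetical stand-in for the get_dates query: maps a date string to an id.
    private final Function<String, Integer> loader;
    private final Map<String, Integer> cache = new HashMap<>();
    private int loaderCalls = 0;

    public SharedLookupCache(Function<String, Integer> loader) {
        this.loader = loader;
    }

    // Returns the cached value; the loader runs only the first time a key is seen.
    public Integer get(String key) {
        if (!cache.containsKey(key)) {
            loaderCalls++;
            cache.put(key, loader.apply(key));
        }
        return cache.get(key);
    }

    // Number of times the (simulated) database was actually queried.
    public int getLoaderCalls() {
        return loaderCalls;
    }

    public static void main(String[] args) {
        // The lambda simulates a database lookup (assumption for the example).
        SharedLookupCache dates = new SharedLookupCache(d -> d.hashCode());

        // Lookups made by the first consumer (think: first tMap).
        dates.get("2019-01-01");
        dates.get("2019-01-02");
        dates.get("2019-01-01");

        // Lookup made by the second consumer (think: second tMap), same cache.
        dates.get("2019-01-01");

        System.out.println("loader calls: " + dates.getLoaderCalls());
    }
}
```

In the job itself, the tHashOutput/tHashInput pair plays the role of the shared map: the table is read once and both tMaps look up against the same in-memory rows.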

Anonymous
Not applicable
Author

Thank you @TRF,

 

Yeah, I thought of this solution too. The problem is that I would need to read all the data in the file twice: once to load the tHashOutput and once to do what the job is already doing. Do you know another solution that avoids reading the file twice?

 

Luiz Ramos

TRF
Creator III

Rows pushed into tHashOutput are held in memory, not written to a file. How many rows do you expect in the dates table?
Anonymous
Not applicable
Author

Yeah, I know.

 

What I'm saying is that I will need to read the data from the original file twice: first to populate the tHashOutput, and then to do what my job is doing. Here is an image of what I think I need to do if I use tHashOutput:

 

0683p000009M9PD.png

 

Luiz Ramos

TRF
Creator III

Should be like this, no?

0683p000009M9aP.png

Anonymous
Not applicable
Author

Ahhhh ok, sorry for that, let me share the scenario.

 

I'm using get_dates as an example, but I want to do the same for all the other lookups in the first image. The date table has 10000 records right now (this will grow), but the other lookups will have more than 1 million records. So I don't want to fetch all the records from each table and insert them into a tHashOutput, because that will exhaust the memory. What I'm thinking of doing is populating the tHashOutput only with records that exist in the file I'm reading, to avoid running out of memory. But for that, I think I need to do what I showed in the second image.

 

For example, if I have 10000 records in the date table, but my file has only 2 different dates, I want to store only those 2 in the tHashOutput, not all 10000.
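
The filtering described above can be sketched in plain Java. This is only an illustration under stated assumptions, not Talend code: the class `DistinctKeyLookup`, both method names, and the sample data are hypothetical, and the full lookup table is simulated with an in-memory map. In a real job the restriction would more likely be pushed into the database query so that the full table never has to be loaded.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DistinctKeyLookup {
    // First pass: collect the distinct key values that occur in the file rows.
    static Set<String> distinctKeys(List<String> fileDates) {
        return new LinkedHashSet<>(fileDates);
    }

    // Keep only the lookup entries whose key actually appears in the file.
    static Map<String, Integer> restrict(Map<String, Integer> fullLookup, Set<String> keys) {
        Map<String, Integer> needed = new HashMap<>();
        for (String key : keys) {
            Integer value = fullLookup.get(key);
            if (value != null) {
                needed.put(key, value);
            }
        }
        return needed;
    }

    public static void main(String[] args) {
        // Simulated date table (assumption): date string -> id.
        Map<String, Integer> dateTable = new HashMap<>();
        dateTable.put("2019-01-01", 1);
        dateTable.put("2019-01-02", 2);
        dateTable.put("2019-01-03", 3);

        // Simulated file content: many rows, but only two different dates.
        List<String> fileDates = Arrays.asList(
                "2019-01-01", "2019-01-03", "2019-01-01", "2019-01-03");

        Map<String, Integer> needed = restrict(dateTable, distinctKeys(fileDates));
        System.out.println("rows kept in memory: " + needed.size());
    }
}
```

The cost of this approach is the extra pass needed to collect the keys, which is the double read of the file being discussed here.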

 

Using this same example, I have another question. One way to avoid the workaround from our last posts is to delete the second tMap and find a way to send the 2 different outputs from the original file to the same tMap. That way, I don't need the second tMap, and "Reload at each row (cache)" will work as intended. However, the 2 outputs have different schemas, and I know that Talend doesn't support a "circular execution flow". Below is an image of my entire job:

 

0683p000009M9B2.png

 

Do you have any suggestions for that?

 

Thank you!

 

Luiz Ramos