
Anonymous
Not applicable

Handling a Large Lookup in tMap (10 Million+ Rows)

Hi,
We have a Talend job (loading a fact table in our DW) that uses more than four lookup tables. Only one of those lookups holds a large amount of data (10 million rows). Whenever the job is executed we get an "Out of Heap Memory" error.
RUN 1:
As suggested on the Talend Help site, I tried increasing the JVM parameters, but even after increasing them I am still unable to execute the job.
JVM parameters:
-Xms256M
-Xmx1610M
Source: SQL Server
Lookup/Target: Oracle
In each of the lookup tables we have enabled the cursor.
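As far as we understand, tMap by default reads the whole lookup flow into an in-memory map, which is presumably what exhausts the heap. A minimal standalone sketch of that behaviour (this is not Talend-generated code; the 10-million row count is from our job, and the integer key/value types are an assumption):

```java
import java.util.HashMap;
import java.util.Map;

// Mimics tMap's default behaviour of caching the entire lookup in memory.
public class LookupFootprint {
    public static void main(String[] args) {
        Map<Integer, Integer> lookup = new HashMap<>(10_000_000);
        for (int i = 0; i < 10_000_000; i++) {
            // Each entry costs far more than its 8 bytes of raw data:
            // two boxed Integers plus the HashMap entry overhead come to
            // roughly 50-60 bytes, so 10M entries need 500 MB+ of heap.
            lookup.put(i, i);
        }
        Runtime rt = Runtime.getRuntime();
        System.out.println("Used heap (MB): "
                + (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024));
    }
}
```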
RUN 2:
We also tried storing the lookup data on disk by enabling the "Store temp data" option in tMap and pointing it at a local directory.
The problem with this method is that we are unable to load all the data from source to target. For example, when the source has 10 million records, only about half a million are loaded into the target (the lookup fails for the records that are not processed).
It also takes longer to process.
Please note:
RAM: 4 GB
Both attempts were unsuccessful. Is there any way in Talend to handle this lookup effectively? If so, please let us know; any inputs would be helpful.

I have also attached a screenshot of the job design below.
[Job design screenshot: 0683p000009MEnh.jpg]
16 Replies
Anonymous
Not applicable
Author

Hi,
Use appropriate Xms and Xmx values, i.e. increase both Xms and Xmx together.
Also increase the cursor size.
Read only the columns that are required for the lookup (see the sketch below).
Enable the parallel lookup option in tMap.
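For example, the pruned lookup query in the Query field of the tOracleInput feeding tMap could look something like this; the table and column names are made up for illustration:

```java
// Hypothetical Query field for the tOracleInput feeding the lookup flow.
// Select only the natural key and the surrogate key you actually map,
// instead of SELECT *, so each cached row stays as small as possible.
"SELECT PRODUCT_CODE, PRODUCT_SK FROM DIM_PRODUCT"
```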
Thanks,
Bhanu
Anonymous
Not applicable
Author

Hi Bhanu,
1. Can you please tell us the maximum JVM heap we can set with 4 GB of RAM?
2. We tried increasing the cursor size, but beyond a certain limit it throws an error.
3. We kept only the two ID columns that are required.
4. Can you please give us a short explanation of the parallel lookup option, so that we can justify our approach?

Thanks
Arul
Anonymous
Not applicable
Author

Any inputs on this? We are still facing the same issue.
Anonymous
Not applicable
Author

Hi,
If you have 4 GB of RAM, then you can think of using 4096m for -Xmx.
Moreover, I would suggest you split the job into two subjobs. As far as your two columns are concerned: if the two columns take about 16 bytes per row, the raw space required is 10 million * 16 bytes = 160 million bytes, which is about 0.16 GB. There should not be any problem holding that much raw data.
You can perform a similar calculation on your actual data to estimate the memory requirement (see the sketch below). If you can break the job into two subjobs, memory management will be more efficient.
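A small sketch of that estimate; the row count and bytes-per-row are assumptions from this thread, and the ~4x factor is a rough allowance for JVM object and map-entry overhead on top of the raw data:

```java
// Back-of-the-envelope sizing for an in-memory lookup cache.
public class LookupSizeEstimate {
    public static void main(String[] args) {
        long rows = 10_000_000L;   // 10 million lookup rows
        long bytesPerRow = 16L;    // two 8-byte ID columns (raw data only)
        long rawBytes = rows * bytesPerRow;
        System.out.printf("Raw data: %.2f GB%n", rawBytes / 1e9);
        // JVM object headers, boxing and map entries typically multiply
        // the raw size several times over; ~4x is a rough rule of thumb.
        System.out.printf("With ~4x JVM overhead: %.2f GB%n", rawBytes * 4 / 1e9);
    }
}
```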
Thanks
Vaibhav
Anonymous
Not applicable
Author

Hi,
As mentioned, we have 4 GB of RAM, but even so I could not increase the Java Xmx beyond 1610M. If I do, I get the following error message:
"Could not reserve enough space for object heap"
Thanks
Arul
Anonymous
Not applicable
Author

Hi all,
Since your large volume is on a lookup, try the "Reload at each row" lookup model so the lookup data is re-queried for each record, filtering it with a WHERE clause if possible (see the sketch after the link):
https://help.talend.com/search/all?query=Handling+Lookups&content-lang=en
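A hedged sketch of what the lookup's Query field might look like in that mode; the table, columns, and the globalMap key exposed by the main flow are made-up names for illustration:

```java
// Hypothetical tOracleInput Query field with the tMap lookup model set
// to "Reload at each row": only the row matching the current main-flow
// key is fetched, so the 10M-row lookup is never held in memory at once.
"SELECT PRODUCT_CODE, PRODUCT_SK FROM DIM_PRODUCT WHERE PRODUCT_CODE = "
    + (Integer) globalMap.get("row1.product_id")
```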
regards
laurent
Anonymous
Not applicable
Author

Hi,
We went through this before but have not implemented it, since it is suggested for a large lookup where the source data is fairly small. In our case the source data is just as large as the lookup.
Thanks
Arul
Anonymous
Not applicable
Author

What Java version are you using, and is it 32-bit or 64-bit? (See the quick check below if you are unsure.) I have set 10240m at a client site with Java 1.7.0.
Have you tried with 1610m?
Have you tried breaking your job into two subjobs?
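A 32-bit JVM on Windows typically cannot reserve a heap much beyond about 1.5 GB, which matches the "Could not reserve enough space for object heap" error above. A quick way to check the bitness (the sun.arch.data.model property is specific to Oracle/Sun JVMs):

```java
// Prints whether the JVM running the job is 32-bit or 64-bit.
public class JvmBitness {
    public static void main(String[] args) {
        // Set by Oracle/Sun JVMs; prints "32" or "64".
        System.out.println("Data model: " + System.getProperty("sun.arch.data.model"));
        // Standard property; e.g. "x86" (32-bit) or "amd64" (64-bit).
        System.out.println("os.arch: " + System.getProperty("os.arch"));
    }
}
```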
Vaibhav
Anonymous
Not applicable
Author

Did you try the "store on disk" option for tMap?
Regards