Hi Experts,
We are doing a migration from Oracle to PostgreSQL using Talend 6.4. We have observed that java.lang.OutOfMemoryError: GC overhead limit exceeded is thrown as the volume increases. We currently had the heap size set to 12 GB, which is not enough for 18 million records; when the heap was increased to 15 GB the job completed. We have around 100 million records to migrate in production, so please help us optimize memory utilization and share any known workarounds.
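For reference, the heap sizes mentioned above are passed as JVM arguments on the job's Run view under Advanced settings (the exact option labels can differ slightly between Studio versions), roughly like this, 15 GB being the value that let the 18 million record run complete:

-Xms1024M
-Xmx15360M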
We found some documentation on using disk instead of memory to store the data processed by tMap (https://help.talend.com/reader/EJfmjmfWqXUp5sadUwoGBA/J4xg5kxhK1afr7i7rFA65w), but we are not able to find the setting to enable "Store temp data" as true (https://help.talend.com/viewer/attachment/j7gt4x19HWatE4J4vdPsKQ).
We also tried removing the tMap, but the memory issue remains. The input of the job is a view joining multiple tables. We have multiple subjobs for different high-volume tables, and we have observed that memory is not released after the high-volume subjobs finish processing.
Thank you in advance,
Rajeswary S
Hi
To enable "Store temp data" as true, configure it in the lookup table settings of the tMap, and then select a temp data directory path.
Also enable storing temp data on disk on other memory-consuming components such as tSortRow. It would help to share a screenshot of your job design so we can see whether the job can be optimized further.
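To illustrate the trade-off that option makes (a conceptual sketch only, not the code Talend actually generates): instead of holding the whole lookup flow in an in-memory map, rows are buffered to files under the chosen temp data directory and scanned back from disk at join time. A minimal Java illustration with made-up names:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DiskBackedLookupSketch {
    public static void main(String[] args) throws IOException {
        // Plays the role of the "Temp data directory path" chosen in the tMap settings.
        Path tempDir = Files.createTempDirectory("tmap_lookup_");
        Path buffer = tempDir.resolve("lookup_buffer.csv");

        // 1) Spill the lookup rows to disk instead of keeping them in a HashMap on the heap.
        try (BufferedWriter w = Files.newBufferedWriter(buffer)) {
            for (int id = 1; id <= 1_000_000; id++) {   // stands in for the lookup flow
                w.write(id + ";value_" + id);
                w.newLine();
            }
        }

        // 2) At join time, stream the buffered rows back from disk rather than from memory.
        String wantedKey = "42";
        try (BufferedReader r = Files.newBufferedReader(buffer)) {
            String line;
            while ((line = r.readLine()) != null) {
                String[] cols = line.split(";", 2);
                if (cols[0].equals(wantedKey)) {
                    System.out.println("matched: " + cols[1]);
                    break;
                }
            }
        }
    }
}

The join gets slower because it reads from disk, but the heap no longer has to hold every lookup row at once, which is why it helps with "GC overhead limit exceeded".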
Regards
Shong
[Attachment: settingsnotavailable (screenshot)]
Hi Shong,
Thank you for the reply. We did see the option mentioned in the link, but it is not visible in Talend Studio version 6.4, which we are using.
Warm Regards,
Rajeswary S
Hi Shong,
Kindly advise on how to proceed; we are blocked.
Warm Regards,
Rajeswary S
Hi,
Aside from the suggestions from @shong, which should work for components like tMap, tSortRow and tUniqRow (to mention a few), also consider whether migrating 100 million rows in one go is the best approach. Is there some form of selectable key, such as a date range or another blocking key, that would allow you to break the data up into more manageable chunks and migrate one chunk at a time?
This would also improve the recoverability of the job if it hits an issue partway through the migration; see the rough sketch below.
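If it helps, here is a rough sketch of the chunking idea in plain JDBC, outside Talend, just to make the pattern concrete. The view, table and column names, connection URLs, credentials and chunk size are all placeholders, and it assumes a numeric key with positive values:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ChunkedMigrationSketch {
    // Placeholder chunk size; tune it to what the heap handles comfortably.
    private static final long CHUNK_SIZE = 500_000L;

    public static void main(String[] args) throws SQLException {
        try (Connection src = DriverManager.getConnection(
                     "jdbc:oracle:thin:@//src-host:1521/ORCL", "src_user", "src_pwd");
             Connection dst = DriverManager.getConnection(
                     "jdbc:postgresql://dst-host:5432/target_db", "dst_user", "dst_pwd")) {

            dst.setAutoCommit(false);

            // Upper bound of the key space; the view and key column are placeholders.
            long maxId;
            try (Statement st = src.createStatement();
                 ResultSet rs = st.executeQuery("SELECT MAX(id) FROM source_view")) {
                rs.next();
                maxId = rs.getLong(1);
            }

            String select = "SELECT id, col_a, col_b FROM source_view WHERE id > ? AND id <= ?";
            String insert = "INSERT INTO target_table (id, col_a, col_b) VALUES (?, ?, ?)";

            long lowerBound = 0L;   // after a failure, restart from the last committed bound
            while (lowerBound <= maxId) {
                long upperBound = lowerBound + CHUNK_SIZE;
                try (PreparedStatement read = src.prepareStatement(select);
                     PreparedStatement write = dst.prepareStatement(insert)) {
                    read.setFetchSize(10_000);          // stream rows instead of buffering the chunk
                    read.setLong(1, lowerBound);
                    read.setLong(2, upperBound);
                    try (ResultSet rs = read.executeQuery()) {
                        while (rs.next()) {
                            write.setLong(1, rs.getLong("id"));
                            write.setString(2, rs.getString("col_a"));
                            write.setString(3, rs.getString("col_b"));
                            write.addBatch();
                        }
                    }
                    write.executeBatch();
                    dst.commit();                        // one commit per chunk keeps it restartable
                }
                lowerBound = upperBound;
            }
        }
    }
}

Inside Talend the same pattern can be built by iterating over key ranges (for example with tLoop or a driving subjob) and passing the lower and upper bounds into the input component's query as context variables, committing once per chunk so a failed run can resume from the last completed range.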