Hi
I created a main job that runs two subjobs at the same time. Each subjob inserts 1 million documents into a MongoDB database: it imports data from a JSON file, transforms the id, and inserts the data. Each JSON file is 11 KB.
My problem is that the execution time is too long: the job takes 5 hours to complete.
Here is the main job :
Here is the first subjob :
Here is the second subjob :
I enabled multi-thread execution for the main job and both subjobs. I also enabled parallel execution on the iterate component and set its value to 5.
Does anyone have ideas for improving performance and reducing the execution time?
Thank you in advance
Hi,
If these two child jobs use the same MongoDB connection and the same jars, you can combine them into one child job.
That way, the jars are loaded and the MongoDB connection is opened once instead of twice, which should save some time. You can then run the subjobs that load your data from file into MongoDB in parallel using the tParallelize component. Try this only if you are open to a design change.
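To illustrate the idea outside of Talend, here is a minimal Java sketch of "open the connection once, then run both loads in parallel". It is not the code Talend generates: `SharedConnection` is a hypothetical stand-in that only counts calls (it is not a MongoDB driver or Talend class), and the two lists stand in for the rows read from your two files.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a shared database connection (assumption, not a real driver class).
class SharedConnection implements AutoCloseable {
    // Counts how many connections were opened, to show the setup cost is paid once.
    static final AtomicInteger OPEN_COUNT = new AtomicInteger();
    private final AtomicInteger inserted = new AtomicInteger();

    SharedConnection() {
        OPEN_COUNT.incrementAndGet();
    }

    // Pretends to insert one document; thread-safe so both loads can share it.
    void insert(String doc) {
        inserted.incrementAndGet();
    }

    int insertedCount() {
        return inserted.get();
    }

    @Override
    public void close() {
        // Nothing to release in this sketch.
    }
}

class ParallelLoadSketch {
    // Opens one connection, runs both loads in parallel, returns the total insert count.
    static int loadBoth(List<String> fileA, List<String> fileB) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try (SharedConnection conn = new SharedConnection()) {
            CompletableFuture<Void> a =
                CompletableFuture.runAsync(() -> fileA.forEach(conn::insert), pool);
            CompletableFuture<Void> b =
                CompletableFuture.runAsync(() -> fileB.forEach(conn::insert), pool);
            // Wait for both loads before closing the shared connection.
            CompletableFuture.allOf(a, b).join();
            return conn.insertedCount();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int total = loadBoth(List.of("doc1", "doc2"), List.of("doc3"));
        System.out.println("Inserted " + total + " documents over "
            + SharedConnection.OPEN_COUNT.get() + " connection(s)");
    }
}
```

In the Talend job, the equivalent would be a single connection component followed by tParallelize branching into the two load subjobs.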
Also, regarding multi-threaded execution, check the number of processors on your execution server.
See https://help.talend.com/reader/68cjh05i2K43RJsuYOQJ1Q/JUkDAmEh26ipMZfskbUXjQ
- This feature is optimal when the number of threads (in general a subJob counts one thread) do not exceed the number of processors of the machine you use for parallel executions. Otherwise, some of the subJobs have to wait until any processor is freed up.
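A quick way to check this on the execution server is to ask the JVM how many processors it sees and compare that with the number of threads you are requesting. The sketch below is only an illustration: `ThreadSizing` and `cappedThreads` are hypothetical names, and the figure of 10 requested threads (two subjobs, each iterating with 5 parallel executions) is a rough estimate of the setup described in the question, not a number reported by Talend.

```java
class ThreadSizing {
    // Caps the requested thread count at the number of available processors.
    static int cappedThreads(int requested, int processors) {
        if (requested < 1 || processors < 1) {
            throw new IllegalArgumentException("both values must be at least 1");
        }
        return Math.min(requested, processors);
    }

    public static void main(String[] args) {
        // Number of processors visible to this JVM on the current machine.
        int processors = Runtime.getRuntime().availableProcessors();
        // Assumption: two subjobs, each with 5 parallel iterations.
        int requested = 2 * 5;
        System.out.println("Processors available: " + processors);
        System.out.println("Threads requested:    " + requested);
        System.out.println("Suggested cap:        " + cappedThreads(requested, processors));
    }
}
```

If the suggested cap is lower than the number of threads requested, some of the subjobs will be waiting for a free processor, as the documentation quoted above describes.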
Thanks