Hello, here is an overview of a job I am currently working on. I would like to improve the performance of the part after the tMap (the job takes about 8 seconds up to that point, but the rest takes about 4 minutes), and I have no idea how. Here's what I've done so far:
- I removed the unnecessary tHashOutput components (I have only one)
- I made sure to keep only the rows and columns I need in tFilterRow and tFilterColumn
- I enabled cursors on all tDBOutput components (even though the problem is not before the tMap)
- I tested parallelization with the right-click option on the subjob, "configure parallelization", but it did not change much
Do you have any ideas?
I can't remove any of the components from the right part because they are all needed for the result, but if there is a component that could replace one of them and improve the performance, let me know.
Hi
Try the following points:
- Allocate more memory to the job execution.
- Store lookup data on disk instead of in memory in the tMap.
- Use a tMap to filter rows and columns instead of tFilterRow + tFilterColumn.
- Create the DB connection in tDBOutput itself instead of using an existing connection.
If the database is installed on a remote server, it needs good bandwidth. Writing data to a local file or database is faster than writing to a remote DB.
Regards
Shong
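To check whether allocating more memory actually took effect, a small snippet like the one below can report the maximum heap the JVM running the job is allowed to use. This is only a sketch: the class name `HeapCheck` and the method `maxHeapMegabytes` are made up for illustration, and it assumes the heap limit is set through the standard JVM option `-Xmx` in the job's JVM arguments. Inside a job, the same `Runtime` call could presumably be placed in a tJava component.

```java
// Hypothetical helper, not part of Talend: reports the JVM's maximum heap size.
class HeapCheck {

    // Maximum heap the JVM will attempt to use, in megabytes.
    // Runtime.maxMemory() reflects the -Xmx setting when one is given.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: "
                + maxHeapMegabytes() + " MB");
    }
}
```

If the printed value does not change after raising the memory setting, the new JVM arguments are probably not being applied to the run.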
Hi @Stéphane Barbezier, do you use the insert and update option, or only insert, in your tDBOutput components?
The combined insert and update option can lead to really poor performance. It's better to separate inserts and updates once you start working with huge datasets.
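The idea of separating inserts from updates can be sketched in plain Java as follows. This is an illustration of the logic only, not Talend-generated code: the names `UpsertSplitter`, `Split`, `split`, `keyColumn` and `existingKeys` are all hypothetical, and it assumes the keys already present in the target table fit in memory. In a job, the equivalent would typically be a tMap lookup against the target table's keys, with matched rows sent to an update-only output and unmatched rows to an insert-only output.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: route each row to an insert flow or an update flow.
class UpsertSplitter {

    // Holds the two output flows.
    static final class Split {
        final List<Map<String, Object>> inserts = new ArrayList<>();
        final List<Map<String, Object>> updates = new ArrayList<>();
    }

    // Rows whose key already exists in the target go to updates;
    // all other rows go to inserts.
    static Split split(List<Map<String, Object>> rows,
                       String keyColumn,
                       Set<Object> existingKeys) {
        Split result = new Split();
        for (Map<String, Object> row : rows) {
            if (existingKeys.contains(row.get(keyColumn))) {
                result.updates.add(row);
            } else {
                result.inserts.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample data: keys 1 and 2 already exist in the target table.
        Set<Object> existingKeys = new HashSet<>(List.of(1, 2));
        List<Map<String, Object>> rows = List.of(
                Map.<String, Object>of("id", 1, "name", "a"),
                Map.<String, Object>of("id", 3, "name", "b"));

        Split result = split(rows, "id", existingKeys);
        System.out.println("inserts: " + result.inserts.size());
        System.out.println("updates: " + result.updates.size());
    }
}
```

Each flow can then be written by its own output component set to insert only or update only, so the database no longer has to check for an existing row on every write.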