Hello,
I have a job that takes data from multiple ODS tables, joins them with multiple tMap components, and inserts the result into a table. In total there should be around 80 GB of data, and the main flow has around 85,000,000 rows (around 15 GB).
All the lookup tables are stored in temp files, and 25 GB of RAM is available for this job.
The inserts are batched, with manual commits.
Even so, the job is quite slow: it has been running for several days without finishing.
Is there any other optimization I can do besides replacing the Talend maps with SQL code?
The problem is clearly not coming from the SQL engine.
What do you think is the average time for Talend to process 80 to 100 GB of data?
Thanks in advance
Regards,
Sofiane
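To put the question about processing time in numbers, here is a rough back-of-the-envelope sketch. The row count and size come from the post above; the target durations are hypothetical values chosen only for illustration, and 1 GB is assumed to mean 10^9 bytes.

```java
public class ThroughputEstimate {

    /** Rows per second needed to move totalRows within the given number of hours. */
    public static double requiredRowsPerSecond(long totalRows, double hours) {
        return totalRows / (hours * 3600.0);
    }

    /** Average row size in bytes, treating 1 GB as 10^9 bytes (an assumption). */
    public static double averageRowBytes(double totalGigabytes, long totalRows) {
        return totalGigabytes * 1_000_000_000.0 / totalRows;
    }

    public static void main(String[] args) {
        long rows = 85_000_000L;   // main-flow row count from the post
        double mainFlowGb = 15.0;  // main-flow size from the post
        double[] targetHours = {1, 8, 24, 72}; // hypothetical targets, not measurements

        System.out.printf("average row size: %.0f bytes%n", averageRowBytes(mainFlowGb, rows));
        for (double h : targetHours) {
            System.out.printf("%5.0f h -> %.0f rows/s%n", h, requiredRowsPerSecond(rows, h));
        }
    }
}
```

Finishing the main flow in 24 hours needs a sustained rate of about 984 rows per second. Comparing that with the rate the job actually reaches (for example from the row counters on the flow) gives a first idea of how far off it is, before deciding where to optimize.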
Hello,
We are working together with the DBA on these flows, and everything is OK on that side.
For the bulk component, I've read that the job should run on the same server as the SQL server, which is not the case for me.
Thanks
Sofiane
Did you try increasing the heap in the Run Job tab, under "Use specific JVM arguments", by raising Xms and Xmx?
The default is -Xms256M -Xmx1024M.
You could increase it to -Xms1024M -Xmx4096M.
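To confirm that the JVM arguments are actually picked up by the running job, a small check like the one below can help. It is only a sketch using the standard `Runtime` API; the class name `HeapCheck` is made up for illustration, and in a Talend job the same statements could presumably go into a tJava component at the start of the job.

```java
public class HeapCheck {

    /** Maximum heap the JVM will attempt to use, in megabytes (reflects -Xmx). */
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        // maxMemory() is the upper bound set by -Xmx; totalMemory() is what is
        // currently reserved; freeMemory() is the unused part of that reservation.
        System.out.println("max heap (MB):  " + maxHeapMb());
        System.out.println("committed (MB): " + rt.totalMemory() / mb);
        System.out.println("free (MB):      " + rt.freeMemory() / mb);
    }
}
```

If the reported maximum is still close to 1024 MB after changing the setting, the job is probably not running with the arguments you expect.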
Hello,
I am already using 30 GB of RAM, and this is not a RAM problem.
Thanks
Sofiane
Hi,
The message you are getting, as it says, is that the connection is closed. So this could really be one of 2 things:
The most likely cause is the 1st option. Check with the DBAs whether there are any open-connection timeouts, etc. Is the destination on-premise, in the cloud, or elsewhere (with network contention)? Either way, I'd consider splitting the job into 2 distinct sections: one that accumulates the data you want to put into the DB (into a temp file), and one that does the actual output of the data into the DB (from temp file to DB).
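The two-section split suggested above can be sketched in plain Java as follows. This is an illustration under assumptions, not Talend-generated code: the class `StagedLoad`, its methods, and the `sink` callback are hypothetical names, and the sink stands in for whatever batched insert and commit the real job performs. In Talend terms, the first stage would roughly end in a tFileOutputDelimited and the second would start from a tFileInputDelimited feeding the database output.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class StagedLoad {

    /** Stage 1: write the fully joined rows to a staging file, one row per line. */
    public static void stage(Iterable<String> rows, Path stagingFile) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(stagingFile, StandardCharsets.UTF_8)) {
            for (String row : rows) {
                out.write(row);
                out.newLine();
            }
        }
    }

    /**
     * Stage 2: read the staging file and hand rows to the sink in fixed-size batches.
     * Returns the number of batches delivered.
     */
    public static int load(Path stagingFile, int batchSize, Consumer<List<String>> sink)
            throws IOException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<String> batch = new ArrayList<>(batchSize);
        try (BufferedReader in = Files.newBufferedReader(stagingFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                batch.add(line);
                if (batch.size() == batchSize) {
                    sink.accept(batch);
                    batch = new ArrayList<>(batchSize);
                    batches++;
                }
            }
        }
        if (!batch.isEmpty()) { // deliver the final, partially filled batch
            sink.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) throws IOException {
        Path staging = Files.createTempFile("staged_rows", ".csv");
        try {
            stage(Arrays.asList("1;a", "2;b", "3;c", "4;d", "5;e"), staging);
            // Hypothetical sink: a real job would do a batched insert and commit here.
            int batches = load(staging, 2, b -> System.out.println("batch of " + b.size()));
            System.out.println("batches: " + batches);
        } finally {
            Files.deleteIfExists(staging);
        }
    }
}
```

The point of the split is that the database connection is only open during the second stage, so a long-running join phase cannot leave an idle connection to time out, and a failed load can be retried from the staging file without redoing the joins.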
David,
Thank you for your reply.
It certainly doesn't come from the server; as I said previously, I am working with the DBA on it.
Is it possible to set the Talend timeout to 0? (I don't know where this parameter is; I only know that the JDBC timeout defaults to 0.)
The infrastructure is all on-premise, and the Talend server is connected to the SQL server through the intranet.
Is it possible to use the bulk component when the Talend job is not on the same server as the SQL server? Or should I manage the temp file manually?
Have a good day.
Sofiane.