
Talend job to insert 3 million rows from a positional file into a SQL Server staging table is very slow
My requirement is to load data from a positional file into a staging table (truncate and insert). The positional file resides on a file server, SQL Server is installed on a separate DB server, and the Talend job runs on the file server, so BULK INSERT can't be used here.
I have created the job in Talend and it inserts the data correctly, but the process is very slow: it processes only 100-200 rows/s. I tried different ways to improve the performance but had no luck.
Here are the process and system details
System RAM: 4 GB / 2 processors (Intel Xeon)
OS: Windows Server 2012 R2 / Windows 7
Talend Data Management version: 7.3.1
Approaches I tried:
1. Used tFileInputPositional (to read the positional file) -> tMap (no joins in the tMap) -> tDBOutput over a JDBC connection.
2. Updated the JVM parameters to -Xms256M and -Xmx1536M.
3. Used parallelization: partitioned into 2 threads from tFileInputPositional -> tMap, then departitioned before tDBOutput. Even after this, performance didn't improve.
Currently the Talend job runs for more than an hour and a half to load 3 million rows into the SQL Server staging table. I'm able to insert the same 3 million records into the staging table in 10 minutes using SSIS. How can I improve the Talend job's performance and bring the processing time down to 10 minutes like SSIS?
Please let me know if anyone has any suggestions.
Thanks in advance!
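For context, the fixed-width parsing step that tFileInputPositional performs can be sketched in plain Java. This is not Talend's actual generated code, just an illustration; the column widths and the sample record here are hypothetical, since the real schema comes from the job's metadata.

```java
import java.util.ArrayList;
import java.util.List;

public class PositionalParser {
    // Hypothetical column widths (id, name, date); the real job's
    // schema defines the actual positions.
    static final int[] WIDTHS = {5, 10, 8};

    // Split one fixed-width line into trimmed fields, one per column.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        int pos = 0;
        for (int w : WIDTHS) {
            int end = Math.min(pos + w, line.length());
            fields.add(line.substring(pos, end).trim());
            pos = end;
        }
        return fields;
    }

    public static void main(String[] args) {
        // Columns: "00001" (5 chars), "JOHN      " (10 chars), "20200101" (8 chars)
        List<String> row = parseLine("00001JOHN      20200101");
        System.out.println(row); // [00001, JOHN, 20200101]
    }
}
```

Parsing 3 million lines this way is cheap; at 100-200 rows/s the bottleneck is almost certainly the per-row database round-trips, not the file reading.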

Hi, maybe you can send the result of the tMap into a tHashOutput, then read the hash to insert the data with tDBOutput. Also check that your tDBOutput is configured as "Insert" only, not "Update or insert".
Send me love and kudos

Hi Jeremy,
I tried what you suggested above. Now I'm getting a memory issue in the tHashOutput component; the error message is "the paging file is too small for this operation to complete".
Could you please tell me how to resolve this memory issue in the tHashOutput component?
Thanks in advance

If tHashOutput doesn't solve the problem, you could try writing the data to a temporary folder instead; the option is in the basic settings of the tMap component:
https://help.talend.com/r/HqPypUO_cLctKv_dAWcgFw/GKM0cFBcfseClZruxOzlTw
You could also increase the commit size in the tDBOutput component.
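The commit-size advice above amounts to sending rows to the server in larger transactions instead of committing row by row. A minimal standalone sketch of that grouping logic (database calls left out so it runs anywhere; in a real JDBC loop each batch would be one addBatch()/executeBatch()/commit() cycle):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchChunker {
    // Group rows into batches of `commitSize`. Each batch corresponds to
    // one executeBatch() + commit() instead of a round-trip per row.
    static <T> List<List<T>> chunk(List<T> rows, int commitSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += commitSize) {
            batches.add(rows.subList(i, Math.min(i + commitSize, rows.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 25; i++) rows.add(i);
        // With commitSize 10, 25 rows become 3 batches: 10 + 10 + 5.
        System.out.println(chunk(rows, 10).size()); // 3
    }
}
```

With 3 million rows, raising the commit size from the default (often 10,000) and enabling the batch-size option in tDBOutput's advanced settings reduces network round-trips dramatically, which is usually where the 100-200 rows/s figure comes from.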

Even if I write the data to a temp folder, will that improve the job's performance?

There are a lot of parameters which can influence your job's performance. It would help to know what you do in your tMap and how your tDBOutput is configured; it would be best if you could post screenshots of your job.
Also, 4 GB of RAM is too small for a Talend project; you should consider using 8 GB minimum.
Try using jTDS instead of the Microsoft JDBC driver; it's often much faster.
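Switching drivers is mostly a matter of changing the connection URL (and putting the jTDS jar on the job's classpath). The host, port, and database name below are hypothetical placeholders:

```
# jTDS (open-source SQL Server driver)
jdbc:jtds:sqlserver://dbserver:1433/StagingDB

# Microsoft JDBC driver, for comparison
jdbc:sqlserver://dbserver:1433;databaseName=StagingDB
```

Whichever driver you use, make sure batching is enabled; a fast driver with row-by-row commits will still crawl.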
