Anonymous
Not applicable

Loading data to Amazon Redshift is very slow

I am evaluating Talend and Amazon Redshift for a big data solution. Unfortunately, Talend does not seem able to load data at a reasonable rate (around 100 rows/sec when doing insert only). I don't want to use another third-party tool just for the loading, and I don't want to go through files on S3 with COPY either.
Any suggestions on how to optimize the Talend job to boost performance?
Are there any future plans to improve that component?
Thanks!
Ophir
3 Replies
Anonymous
Not applicable
Author

Hi,
Performance issues like this are usually caused by the DB connection or the job design. Could you please post screenshots of your job to the forum so that we can address the issue more quickly?
Best regards
Sabrina
Anonymous
Not applicable
Author

Hi Sabrina,
I found the problem. There are two bottlenecks in the process. The first is the network: once I ran the job on a server with a fast network connection, that solved part of the problem. The second is the "update else insert" mode. Once I changed it to insert only, I could run the load in batch mode (500 rows per insert works like magic). The data is loaded into a temp table in Redshift, and the update-else-insert is then performed with a join from the temp table to the target table, as proposed by Amazon (their recommended staging-table pattern for upserts; sketched below).
That way I managed to insert 1000 rows per second. The update-else-insert itself is very fast when done inside Redshift, so this design works for me.
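For readers landing on this thread, here is a minimal sketch of that staging-table merge, assuming a target table target_events, a staging table staging_events with the same columns, and a key column event_id (these names are placeholders, not taken from the original job):

-- Load the batch into staging_events first, using multi-row inserts
-- (e.g. 500 rows per INSERT statement) rather than one row at a time.
BEGIN;

-- 1. Update rows that already exist in the target.
UPDATE target_events
SET    payload    = s.payload,
       updated_at = s.updated_at
FROM   staging_events s
WHERE  target_events.event_id = s.event_id;

-- 2. Insert rows that are not yet in the target.
INSERT INTO target_events
SELECT s.*
FROM   staging_events s
LEFT JOIN target_events t ON t.event_id = s.event_id
WHERE  t.event_id IS NULL;

END;

-- Clear the staging table for the next batch. In Redshift, TRUNCATE
-- commits the current transaction, so it is kept outside the BEGIN/END.
TRUNCATE staging_events;

Keeping the update and insert in one transaction against the staging table is what makes the upsert fast, since both statements run as set-based operations inside Redshift instead of row-by-row from the ETL side.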
I wonder if Talend could provide a better upsert mechanism in the future, one that uses batch mode the way the plain insert does.
Thanks !
Ophir
Anonymous
Not applicable
Author

Hi,
You are welcome to open a JIRA issue for this requirement against the DI project on the Talend Bug Tracker.
Best regards
Sabrina