Anonymous
Not applicable

Need help to optimize my Talend job

Hello reader,
I am new to Talend and have very limited experience with it.
My task requires performing a daily incremental update from SQL RDS to Redshift, but my job runs with a very slow transfer rate.
Details are listed below.


My SQL RDS query is:
"select * from test.ankit2 where id > (SELECT COALESCE(max(id), 0) as id FROM test.stagetable)"


ankit2 is the table in my RDS, and stagetable is the table in Redshift.
I used a tMap component to link the RDS input component to the Redshift output component.
Please have a look at the image below and share your suggestions.
Any help will be appreciated.
Regards,
Ankit
(screenshot of the Talend job attached)

6 Replies
Anonymous
Not applicable
Author

Hi,
Have you already tried to carry out a bulk insert of a defined set of rows, instead of inserting rows one by one, by using the "Extend Insert" option in the tRedshiftOutput component?
Best regards
Sabrina
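
Roughly speaking, "Extend Insert" batches many records into one multi-row INSERT instead of issuing one statement (and one network round trip) per record. A minimal SQL sketch of the difference, assuming a hypothetical column col1 alongside id in test.stagetable:

  -- Row-by-row: one statement and one round trip per record (slow over a WAN link)
  INSERT INTO test.stagetable (id, col1) VALUES (101, 'a');
  INSERT INTO test.stagetable (id, col1) VALUES (102, 'b');

  -- Extended insert: many records per statement, far fewer round trips
  INSERT INTO test.stagetable (id, col1) VALUES
    (101, 'a'),
    (102, 'b'),
    (103, 'c');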
Anonymous
Not applicable
Author

Thanks for the suggestion.
It helped; the job is now running much faster than before.
Regards
Anonymous
Not applicable
Author

Hi,
Thanks for your quick feedback.
Is it a normal speed now? We saw that the row rate was 8.52 rows/s during data processing in your screenshot.
Best regards
Sabrina
Anonymous
Not applicable
Author

Nope,
this is not the normal speed; it is much slower than what is desired.
However, after tuning the job as you suggested, it is working much faster.
But if I try to increase the number of rows per insert above 200, I get this error:
Exception in component tRedshiftOutput_1 org.postgresql.util.PSQLException: An I/O error occured while sending to the backend.

Any suggestions for it?
Regards,
Ankit Khanduri
Anonymous
Not applicable
Author

Hi,
"An I/O error occured while sending to the backend."

It looks like there is an issue on the DB side with processing that much data at a time. It may be a DB configuration issue (for example a timeout setting) or a restriction/limit on data transmission.

Best regards
Sabrina
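
Along those lines, one thing worth checking (a hedged suggestion, not something confirmed in this thread) is the session's statement_timeout on the Redshift side, since a low value can cut off long-running batch inserts:

  -- Show the current timeout for the session (0 means no limit)
  SHOW statement_timeout;

  -- Raise or disable it for the loading session only
  SET statement_timeout TO 0;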
_AnonymousUser
Specialist III

Hello,
You should use the PostgreSQL output component to do a batch load into Redshift. In the Advanced settings you have to:
  • activate "Batch size" and set it to 100000
  • commit every 100000 rows
  • use "Cursor size" in your input DB component and set it to 100000
In the Basic settings:
  • do not use an existing connection in the output
  • use an existing connection in the input
In general:
  • do not use a Talend Java component between input and output
  • do all your transformations in SQL in the input component's query (see the sketch below)
Best regards
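
As a hedged illustration of the last point, doing the reshaping in the source query rather than in a tMap might look like the following. The UPPER() and CAST() columns are hypothetical examples, not columns from the real ankit2 table, and the literal 12345 stands in for the last id already loaded into test.stagetable (which the job would supply, e.g. through a context variable):

  -- Incremental extract with the transformation done in SQL on the RDS side,
  -- so nothing needs to sit between the input and output components.
  SELECT
      id,
      UPPER(name)                    AS name_clean,   -- hypothetical text cleanup
      CAST(amount AS DECIMAL(18, 2)) AS amount        -- hypothetical type alignment
  FROM test.ankit2
  WHERE id > 12345;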