Anonymous
Not applicable

Improve data load performance in Oracle

Hi All,

I am loading data (more than 5 lakh, i.e. ~500,000, records) from Oracle table X to Oracle table Y, using NEXTVAL as the ID in the OracleInput component's select statement, with just a one-to-one mapping. But it is giving very poor performance: the job takes more than 7 hours. I could use the cursor option, but with that I can get a heap memory error. Could anyone suggest how to reduce the time and improve the performance?
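For reference, the pattern being described looks roughly like this in plain JDBC terms. This is a minimal sketch; the connection details, table x_table, and sequence y_seq are hypothetical. Talend's cursor option corresponds to the JDBC fetch size set here, which streams rows in chunks instead of buffering the whole result set in the heap:

```java
import java.sql.*;

// Minimal sketch of the described read side: sequence NEXTVAL as the ID,
// with a fetch size so rows are streamed in chunks rather than all at once.
// Connection details, x_table, and y_seq are hypothetical.
public class ReadWithSequence {
    public static void main(String[] args) throws SQLException {
        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/orcl", "user", "password");
             Statement stmt = con.createStatement()) {
            stmt.setFetchSize(1000); // stream 1,000 rows per round trip
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT y_seq.NEXTVAL AS id, x.* FROM x_table x")) {
                while (rs.next()) {
                    long id = rs.getLong("id");
                    // the one-to-one mapping into table Y would happen here
                }
            }
        }
    }
}
```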

 

Thanks

Vishal Vats

1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

Hi,

 

     If you are using parallelism, you are still using Talend's capability to process the data, but you are not using the database's bulk load capability. So the processing speed depends on two factors: Talend's parallel processing capability and the database's capability to load the processed data in the normal fashion (since we are not using bulk components). Eventually we are pressing the services of two different systems in a stressful way when a much easier option is available in the form of a bulk load.

 

    Now, coming to the second query, about extracting the data from DB to file: that part will indeed take less time. The difference shows up when you load the data back using the normal file-to-DB components, as that creates a lot of redo logs, uses a small commit size, etc., which eventually reduces the throughput. If you use a bulk component at this point instead, it keeps logging to a minimum and processes at a much higher row count. These are the reasons why I always suggest going for a bulk load.
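To make the commit-size point concrete, here is a hedged JDBC sketch of the conventional path: statements are batched and committed every few thousand rows rather than one by one, which is the knob the normal components expose. A bulk (direct-path) load goes further and bypasses most of the redo logging entirely. Table and column names are hypothetical:

```java
import java.sql.*;

// Sketch of the conventional load path with a controlled commit size.
// Committing every row forces constant redo-log work; batching and
// committing every few thousand rows cuts that overhead. Names are
// hypothetical.
public class BatchedInsert {
    static void load(Connection con, ResultSet rs) throws SQLException {
        con.setAutoCommit(false); // commit manually, not per row
        try (PreparedStatement ps = con.prepareStatement(
                "INSERT INTO y_table (id, payload) VALUES (?, ?)")) {
            int pending = 0;
            while (rs.next()) {
                ps.setLong(1, rs.getLong("id"));
                ps.setString(2, rs.getString("payload"));
                ps.addBatch();
                if (++pending == 5000) {      // batch + commit interval
                    ps.executeBatch();
                    con.commit();
                    pending = 0;
                }
            }
            ps.executeBatch(); // flush the remaining rows
            con.commit();
        }
    }
}
```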

 

   Hope the answer made you a happy man 🙂

 

Warm Regards,

 

Nikhil Thampi


5 Replies
Anonymous
Not applicable
Author

Hi,

 

    Could you please try the Oracle bulk load components when loading data at the volume you mentioned?
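For context, Talend's Oracle bulk components (tOracleOutputBulk to write a flat file, tOracleBulkExec to drive the loader) stage data to a file and hand it to Oracle's direct path. If X and Y happen to live in the same database (the original post does not say), the same direct-path idea is available as a single statement. A minimal sketch, with hypothetical names; note this is an alternative technique, not what the bulk components themselves do:

```java
import java.sql.*;

// If source and target tables are in the same Oracle database, one
// direct-path INSERT ... SELECT can replace the whole extract/load job.
// The APPEND hint requests a direct-path insert. Names are hypothetical.
public class DirectPathCopy {
    static void copy(Connection con) throws SQLException {
        try (Statement stmt = con.createStatement()) {
            stmt.executeUpdate(
                "INSERT /*+ APPEND */ INTO y_table " +
                "SELECT y_seq.NEXTVAL, x.* FROM x_table x");
            // a direct-path insert must be committed before the session
            // can read the table again
            con.commit();
        }
    }
}
```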

 

Warm Regards,

 

Nikhil Thampi

ankit7359
Creator II

Hi @vishuoo1,

As per your scenario, you are loading 5 lakh (~500,000) records, which is a large number. Since your source and target are both databases (DB to DB), the transfer is slower and performance takes a hit. You also mentioned the cursor option: with a cursor you might be able to complete the load, but the time taken will be almost the same. So I have three possible solutions for your scenario:

1) Try following @nthampi's solution; maybe you will achieve the desired result.

2) Try parallelization by activating "Set Parallelization" and then perform the rest of the operations.

3) Maybe you could try loading DB to file and then file to DB, which should improve performance; enabling the cursor option on the extract would help further (see the sketch after this list).
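A rough sketch of the extract half of option 3, assuming hypothetical names: the fetch size keeps only a chunk of rows in memory while table X is streamed to a flat file, which a bulk loader could then pick up:

```java
import java.io.*;
import java.sql.*;

// Stream table X to a delimited flat file with a modest fetch size so the
// extract never holds more than a chunk of rows in memory. File path,
// query, and delimiter are hypothetical.
public class DbToFile {
    static void extract(Connection con) throws SQLException, IOException {
        try (Statement stmt = con.createStatement();
             PrintWriter out = new PrintWriter(new BufferedWriter(
                     new FileWriter("/tmp/x_table.csv")))) {
            stmt.setFetchSize(1000);
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT id, payload FROM x_table")) {
                while (rs.next()) {
                    out.println(rs.getLong("id") + ";" + rs.getString("payload"));
                }
            }
        }
    }
}
```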

Please reach out to the Talend Community if necessary.

Thanks,

Ankit

Anonymous
Not applicable
Author

@ankit7359

 

I am afraid the second and third approaches mentioned in your post may cause performance hiccups with bigger files. So my suggestion would be to stick with the bulk load components.

 

Warm Regards,

 

Nikhil Thampi

ankit7359
Creator II

Hi @nthampi,

Greetings of the day,

Well, thanks for correcting me, but I do have a query about this.

Please correct me if I am going wrong: when we enable parallelization, we invoke multi-threading of the job; with the corresponding row settings, the records are split across the processors allocated to handle these threads, as in the sketch below.
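As a toy illustration only (not Talend's actual implementation), the idea is that the row stream is partitioned and each partition is handled by its own worker thread:

```java
import java.util.*;
import java.util.concurrent.*;

// Conceptual sketch of row partitioning across worker threads: each
// partition of the row stream is processed by its own thread in a pool.
public class ParallelRows {
    public static void main(String[] args) throws InterruptedException {
        List<List<Integer>> partitions = List.of(
                List.of(1, 2, 3), List.of(4, 5, 6), List.of(7, 8, 9));
        ExecutorService pool =
                Executors.newFixedThreadPool(partitions.size());
        for (List<Integer> part : partitions) {
            pool.submit(() -> part.forEach(row ->
                    System.out.println(Thread.currentThread().getName()
                            + " processed row " + row)));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```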

Also, can you please explain why DB to file and then file to DB is not recommended? As far as I know, DB to file takes less time to load. Please correct me if I am wrong anywhere in this context.

Thanks,

Ankit
