parallel execution of serial statements

Anonymous · ‎2018-10-10

Hi All,

I'm working on a job that needs to run two groups of statements in parallel, while the statements inside each group run serially.

Assume the following

I have two target tables, orders and invoice, and four statements:

target_table | execution order | code
orders       | 1               | truncate table stg.orders;
orders       | 2               | insert into stg.orders select * from src.orders
invoice      | 1               | truncate table stg.invoice
invoice      | 2               | insert into stg.invoice select * from src.invoices

Now I want the target tables to be processed in parallel, but the code for each table to run serially, so that the following is executed:

thread 1: truncate table stg.orders; insert into stg.orders select * from src.orders;

thread 2: truncate table stg.invoice; insert into stg.invoice select * from src.invoices;

I'm using Talend Data Services Platform 6.3.1 (Enterprise edition).

Any help is very much appreciated.

Regards,
Gilles

Anonymous · ‎2018-10-11

Hi,

There are multiple ways to handle this scenario. A quick way is to pass control in parallel to two child jobs, one per target table. In each child job, you can run the statements in order using multiple tDBRow components.
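Outside Talend, the same idea (one parallel branch per target table, statements in order inside each branch) can be sketched in plain Python. This is only an illustration under stated assumptions: `run_sql` is a hypothetical stand-in that records each statement instead of calling a database, and the rows are the four statements from the question.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import threading

# (target_table, execution_order, code) rows taken from the question.
STATEMENTS = [
    ("orders", 1, "truncate table stg.orders"),
    ("orders", 2, "insert into stg.orders select * from src.orders"),
    ("invoice", 1, "truncate table stg.invoice"),
    ("invoice", 2, "insert into stg.invoice select * from src.invoices"),
]

executed = []              # (thread_name, sql) in the order statements ran
_lock = threading.Lock()   # protects `executed` across worker threads


def run_sql(sql):
    """Hypothetical stand-in for a real database call: it only logs the statement."""
    with _lock:
        executed.append((threading.current_thread().name, sql))


def run_group(rows):
    """Run one target table's statements serially, sorted by execution order."""
    for _, _, sql in sorted(rows, key=lambda r: r[1]):
        run_sql(sql)


def run_all(statements):
    """Group statements by target table and submit each group as its own task."""
    groups = defaultdict(list)
    for row in statements:
        groups[row[0]].append(row)
    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        futures = [pool.submit(run_group, rows) for rows in groups.values()]
        for future in futures:
            future.result()  # re-raise any error raised inside a worker


run_all(STATEMENTS)
```

Each group is a single task, so a table's insert can never start before its own truncate has finished, while the two tables do not wait for each other. In a Talend job the child job plays the role of `run_group`.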

Could you please try this? If it helps, please mark the topic as resolved. Kudos are also welcome 🙂

Warm Regards,

Nikhil Thampi

sunny3 · ‎2018-10-11

Hi,

I want to create a Talend job for a database table that has more than 10,000,000 records, and I want to load all the records to a file.

The approach below takes 5-6 hours:

tOracleInput --> tFileOutputDelimited

Can anybody please help me load the data faster?

Can I run the job so that it loads 100 or 1000 rows at a time, to make the load faster? I have also tried tFlowToIterate --> tFixedFlowInput, configured to iterate 100 executions, but the job runs very slowly after a certain time.

Anonymous · ‎2018-10-11

Thanks, I know about the tParallelize option, but my case is a little more complex.

target_table | execution order | code
orders       | 1               | truncate table stg.orders;
orders       | 2               | insert into stg.orders select * from src.orders
invoice      | 1               | truncate table stg.invoice
invoice      | 2               | insert into stg.invoice select * from src.invoices

I want orders and invoice executed in parallel, so how do I force that? Within each table I then want statements 1 and 2 executed serially. In short: parallel by target table, serial by execution order.

Anonymous · ‎2018-10-12

@gillesp - You can easily do this with tParallelize, as shown in the screenshots in my previous post.

@pati.pranati - You do not need tFlowToIterate; it only serializes the flow further. There are several ways to speed up the data extraction in your case.

a) In the same job, use tParallelize to create multiple parallel flows, each extracting a specific time frame of data from the source database. Please increase the memory available to the job so it can handle the extra data volume.

b) You can parallelize further by scheduling multiple jobs with the same logic, each using tParallelize to pick up a different time frame of the source table. The advantage is that if an error occurs, you do not have to run everything again; you rerun only the job that failed.
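To make the time-frame split concrete, here is a minimal sketch in plain Python rather than Talend. Everything in it is an assumption for illustration: `SOURCE` and `fetch_rows` are hypothetical stand-ins for the source table and a date-bounded select, the 7-day window size is arbitrary, and each window is written to its own file so a failed window can be rerun alone.

```python
import csv
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta

# Hypothetical stand-in for the source table: one row per day of January 2018.
SOURCE = [(i, date(2018, 1, 1) + timedelta(days=i)) for i in range(31)]


def fetch_rows(start, end):
    """Hypothetical stand-in for a select bounded by start <= created < end."""
    return [row for row in SOURCE if start <= row[1] < end]


def make_windows(start, end, days):
    """Split [start, end) into consecutive windows of at most `days` days."""
    windows = []
    current = start
    while current < end:
        nxt = min(current + timedelta(days=days), end)
        windows.append((current, nxt))
        current = nxt
    return windows


def extract_window(window, out_dir):
    """Write one window's rows to its own delimited file and return the path."""
    start, end = window
    path = os.path.join(out_dir, f"extract_{start.isoformat()}.csv")
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        for row_id, created in fetch_rows(start, end):
            writer.writerow([row_id, created.isoformat()])
    return path


def extract_all(start, end, days, out_dir, workers=4):
    """Extract all windows using a pool of worker threads."""
    windows = make_windows(start, end, days)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda w: extract_window(w, out_dir), windows))


out_dir = tempfile.mkdtemp()
paths = extract_all(date(2018, 1, 1), date(2018, 2, 1), 7, out_dir)
```

The windows are half-open, so a row on a boundary date lands in exactly one file. Whether this actually speeds things up depends on the database and on having a usable column to split on, which the thread does not specify.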

I hope this helps both of you. If it resolves your query, could you please mark the topic as resolved? Kudos are also welcome 🙂

Warm Regards,

Nikhil Thampi

sunny3 · ‎2018-10-12

Thanks Nikhil, it worked for me.
