Anonymous
Not applicable

Batch processing in a Talend job

Hi team,

I need to implement batch processing in my Talend job. How can I achieve it? The scenario is as follows.

Suppose I have 30,000 records in my file and need to process 1,000 records at a time, then the next 1,000, and so on.

How can I achieve this? 30,000 records means 30 batches of 1,000 records each. Please help me with this scenario. I am using Talend Data Fabric version 6.4.1.

 

Thanks,

Bhushan 

15 Replies
Anonymous
Not applicable
Author

Hi TRF,

Sorry for the late reply. I have not tried the given solution yet because of urgent deliverables. I will let you know when I try it.

Thanks for the answer mark.

 

Thanks,

Bhushan 

sunny3
Contributor

Hi,

I want to create a Talend job where my database table has 100,00000 plus records, and I want to load all the records to a file.

The approach below takes 5-6 hours.

 tOracleInput --> tFileInputDelimited

 

Can anybody help me load the data faster?

 

Can I run the job to load 100 or 1000 rows at a time so that it loads faster? I have also used tFlowToIterate --> tFixedFlowInput, configured to iterate 100 executions, but the job runs very slowly after a certain time.

 

 

Anonymous
Not applicable
Author

hi @rhall 

 

I read your code, and the following is my understanding:

- The first tJavaFlex splits the data into many map items (each map item contains an Integer value and an ArrayList).

- The second tJavaFlex consumes the list of map items.

But how do we process the map items one by one?

Is the link from tJavaFlex 1 to tJavaFlex 2 an Iterate link (meaning tJavaFlex1 ---Iterate---> tJavaFlex2)?

How can we link them to the next processing step one by one?

Thanks

Anonymous
Not applicable
Author

You basically have the idea. In order to release the data (each ArrayList from the tJavaFlex), you will need to list the HashMap keys and iterate over them, passing each key to the second tJavaFlex. So for each iteration you will release all of the ArrayList values in blocks of however many you grouped them by.
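
To make the idea concrete outside of a Talend job, here is a minimal plain-Java sketch of grouping rows into a HashMap of ArrayLists and then releasing them one key at a time. It is not the original job code: the class name `BatchRelease`, the method `groupIntoBatches`, and the use of `Integer` rows in place of a generated row struct are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class BatchRelease {

    // Hypothetical helper: group rows into batches keyed by batch number (0, 1, 2, ...).
    public static <T> Map<Integer, ArrayList<T>> groupIntoBatches(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Map<Integer, ArrayList<T>> map = new HashMap<>();
        for (int i = 0; i < rows.size(); i++) {
            int key = i / batchSize; // rows 0..999 -> batch 0, 1000..1999 -> batch 1, ...
            map.computeIfAbsent(key, k -> new ArrayList<>()).add(rows.get(i));
        }
        return map;
    }

    public static void main(String[] args) {
        // Stand-in for the 30,000 input rows from the original question.
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 30000; i++) {
            rows.add(i);
        }

        Map<Integer, ArrayList<Integer>> map = groupIntoBatches(rows, 1000);

        // HashMap does not guarantee key order, so sort the keys before iterating.
        for (Integer key : new TreeSet<>(map.keySet())) {
            ArrayList<Integer> batch = map.get(key);
            // In a job, this is the point where one batch would be released downstream.
            System.out.println("batch " + key + ": " + batch.size() + " rows");
        }
    }
}
```

In a job, the loop over the keys would be replaced by whatever produces one iteration per key, with the key passed along to the second tJavaFlex.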

Anonymous
Not applicable
Author

hi @rhall
So, how can we return the segments one by one in the second tJavaFlex?
Currently I see that each input row from the first tJavaFlex goes directly to the input of the second tJavaFlex.
Anonymous
Not applicable
Author

Sorry about the late reply @phancongphuoc. I hope you had a great new year.

 

In answer to your question, all you need to do is come up with a way of iterating over the different batches. This will very much depend upon what you are trying to achieve. But let's say you are simply using a tFlowToIterate component to link to the second tJavaFlex which releases the rows per batch. If you take a look at this code (taken from the example above) you will see I hardcoded it to retrieve only batch 0. Change this to use a value set in the globalMap by the tFlowToIterate and that solves your problem.

 

//Retrieve a batch from the HashMap. YOU WILL NEED TO MODIFY THIS TO SUIT YOUR REQUIREMENT. I have hard coded it to only batch 0
java.util.ArrayList<row1Struct> array = (java.util.ArrayList<row1Struct>)map.get(0);
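
As a rough sketch of that change, the hardcoded `0` can be replaced by a key read from the globalMap on each iteration. The example below simulates this in plain Java so it can run on its own: the globalMap is modelled as an ordinary `Map<String, Object>`, and the key name `"row2.batchKey"`, the class `BatchLookup`, and the method `currentBatch` are hypothetical names, not taken from the original job. Use whatever key your tFlowToIterate actually sets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class BatchLookup {

    // Hypothetical helper: fetch the batch whose key was stored in globalMap for this iteration.
    public static <T> ArrayList<T> currentBatch(Map<String, Object> globalMap,
                                                Map<Integer, ArrayList<T>> map,
                                                String keyName) {
        Object raw = globalMap.get(keyName);
        if (!(raw instanceof Integer)) {
            throw new IllegalStateException("No Integer batch key found under " + keyName);
        }
        ArrayList<T> batch = map.get((Integer) raw);
        // Return an empty list rather than null if the batch does not exist.
        return batch != null ? batch : new ArrayList<>();
    }

    public static void main(String[] args) {
        // Stand-in for the HashMap built by the first tJavaFlex.
        Map<Integer, ArrayList<String>> map = new HashMap<>();
        map.put(0, new ArrayList<>(Arrays.asList("a", "b")));
        map.put(1, new ArrayList<>(Arrays.asList("c", "d")));

        // Stand-in for the job's globalMap; the key name is an assumption.
        Map<String, Object> globalMap = new HashMap<>();

        for (int key = 0; key < map.size(); key++) {
            globalMap.put("row2.batchKey", key); // what the iterating component would set
            ArrayList<String> array = currentBatch(globalMap, map, "row2.batchKey");
            System.out.println("batch " + key + " -> " + array);
        }
    }
}
```

Inside the second tJavaFlex the equivalent change would be a single line that reads the key from the globalMap instead of passing the literal `0` to `map.get(...)`.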