Hi team,
I need to implement batch processing in my Talend job. How can I achieve it? The scenario is as follows.
Suppose I have 30,000 records in my file and need to process 1,000 records at a time, then the next 1,000, and so on.
That is 30,000 records in 30 batches. Please help me with this scenario. I am using Talend Data Fabric version 6.4.1.
Thanks,
Bhushan
Hi TRF,
Sorry for the late reply. I have not tried the given solution yet because of urgent deliverables. I will let you know when I try it.
Thanks for the answer mark.
Thanks,
Bhushan
Hi,
I want to create a Talend job that loads all the records from a database table with more than 10,000,000 records into a file.
The approach below takes 5-6 hours:
tOracleInput --> tFileInputDelimited
Can anybody help me load the data faster?
Can I run the job so that it loads 100 or 1,000 rows at a time, to make it faster? I have also tried tFlowToIterate --> tFixedFlowInput, configured to iterate 100 executions, but the job becomes very slow after a certain time.
Hi @rhall
I read your code, and this is my understanding:
- The first tJavaFlex splits the data into many map items (each map item contains an Integer key and an ArrayList)
- The second tJavaFlex consumes the list of map items
But how do I process the map items one by one?
Is the link from tJavaFlex 1 to tJavaFlex 2 an Iterate link (that is, tJavaFlex1 ---Iterate---> tJavaFlex2)?
How can we pass them to the next processing step one by one?
Thanks
You basically have the idea. In order to release the data (each ArrayList from the tJavaFlex) you will need to list the HashMap keys and iterate over them, passing each key to the second tJavaFlex. So for each iteration you will release all of the ArrayList values in blocks of however many you grouped them by.
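To make the grouping and key iteration concrete, here is a minimal standalone sketch in plain Java, outside of Talend. It is not the code from the tJavaFlex example; the class name BatchSketch and the method groupIntoBatches are hypothetical, and it uses a LinkedHashMap instead of a HashMap only so that the keys come back in batch order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Talend: shows the "group, then iterate over keys" idea.
final class BatchSketch {

    // Groups rows into batches of batchSize, keyed by batch number (0, 1, 2, ...).
    static <T> Map<Integer, List<T>> groupIntoBatches(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Map<Integer, List<T>> batches = new LinkedHashMap<>();
        for (int i = 0; i < rows.size(); i++) {
            // Integer division maps rows 0..999 to key 0, 1000..1999 to key 1, and so on.
            batches.computeIfAbsent(i / batchSize, k -> new ArrayList<>()).add(rows.get(i));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in for the 30,000 input records from the original question.
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 30000; i++) {
            rows.add(i);
        }

        Map<Integer, List<Integer>> batches = groupIntoBatches(rows, 1000);

        // List the keys and release one batch per iteration.
        for (Integer key : batches.keySet()) {
            List<Integer> batch = batches.get(key);
            System.out.println("batch " + key + " holds " + batch.size() + " rows");
        }
    }
}
```

In a real job the loop over the keys would be replaced by whatever drives the iteration (for example a flow of keys feeding a tFlowToIterate), with the second tJavaFlex doing the per-batch work.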
Sorry about the late reply @phancongphuoc. I hope you had a great new year.
In answer to your question, all you need to do is come up with a way of iterating over the different batches. This will very much depend upon what you are trying to achieve. But let's say you are simply using a tFlowToIterate component to link to the second tJavaFlex which releases the rows per batch. If you take a look at this code (taken from the example above) you will see I hardcoded it to retrieve only batch 0. Change this to use a value set in the globalMap by the tFlowToIterate and that solves your problem.
//Retrieve a batch from the HashMap. YOU WILL NEED TO MODIFY THIS TO SUIT YOUR REQUIREMENT. I have hard coded it to only batch 0
java.util.ArrayList<row1Struct> array = (java.util.ArrayList<row1Struct>)map.get(0);
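As a rough illustration of that change, the sketch below replaces the hardcoded 0 with a key read from a map that stands in for Talend's globalMap. It is standalone Java so it can be run outside a job. The key name "row2.batchKey", the class BatchLookupSketch and the method currentBatch are all assumptions made for this example; in your job the key depends on the row and column names feeding your tFlowToIterate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical standalone sketch: a plain HashMap stands in for Talend's globalMap.
final class BatchLookupSketch {

    // Returns the batch whose key is stored under globalMapKey, or an empty list if none matches.
    static List<String> currentBatch(Map<String, Object> globalMap,
                                     String globalMapKey,
                                     Map<Integer, List<String>> batches) {
        // Instead of map.get(0), read the batch key set for the current iteration.
        Integer batchKey = (Integer) globalMap.get(globalMapKey);
        List<String> batch = batches.get(batchKey);
        return batch == null ? new ArrayList<>() : batch;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> batches = new HashMap<>();
        batches.put(0, Arrays.asList("a", "b"));
        batches.put(1, Arrays.asList("c", "d"));

        // Simulates the value an iteration would place in globalMap (key name is assumed).
        Map<String, Object> globalMap = new HashMap<>();
        for (int key = 0; key < 2; key++) {
            globalMap.put("row2.batchKey", key);
            System.out.println(currentBatch(globalMap, "row2.batchKey", batches));
        }
    }
}
```

Inside the tJavaFlex itself the equivalent line would cast the batch to java.util.ArrayList<row1Struct>, as in the snippet above, with the argument to map.get taken from globalMap rather than written as a literal.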