Skip to main content
Announcements
Accelerate Your Success: Fuel your data and AI journey with the right services, delivered by our experts. Learn More
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Aggregation on large Table Data

I have a table containing 33 million records in MySQL database. I need to aggregate the rows. Could you help to know the best/efficient method to solve this with performance?
1. Extract all the 33 million records into Talend,  sort and aggregate in the Talend tool
2. TMySQLInput with following extract query
   SELECT A,B,C, SUM(D),SUM(E)
   from table1
   group by A,B,C
3. Any other better way other than listed above
Thanks,
Srini
Labels (3)
2 Replies
Anonymous
Not applicable
Author

Hi,
Here is a component  TalendHelpCenter:tAggregateRow which  receives a flow and aggregates it based on one or more columns. For each output line, are provided the aggregation key and the relevant result of set operations.
Please take a look at the related scenario in this component reference.
Best regards
Sabrina
David_Beaty
Specialist
Specialist

Hi,

 

A combination of tMySQLInput (with order by criteria) and tAggregateSortedRow components together. Make sure you correctly set the "Input Rows Count" which means you'll probably need to read the data into an intermediate file.

 

tCreateTemporaryFile

|

tMySQLINput -> tFileOutputDelimited

|

tFileInputDelimted -> tAggregateSortedRow -> (whatever you want to do with the aggregated data)

 

Thanks

 

David