shameer1
Contributor

When to use tAggregateRow and when to use tSortRow + tAggregateSortedRow

I am new to Talend. Can anybody tell me when to use tAggregateRow and when to use tSortRow -> tAggregateSortedRow, given that both produce the same output?

 

 

6 Replies
Anonymous
Not applicable

Hi,

 

Please refer to the post below, where it is explained with an example.

 

https://community.talend.com/t5/Design-and-Development/resolved-taggregaterow-vs-taggregatesortedrow...

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂

Anonymous
Not applicable

Hi,

 

tAggregateSortedRow aggregates already-sorted input data into output columns based on a set of operations.

 

tAggregateRow receives a flow and aggregates it based on one or more columns.

 

https://help.talend.com/reader/KxVIhxtXBBFymmkkWJ~O4Q/VVQYE5AV~OFaAnSfC13t2g

https://help.talend.com/reader/KxVIhxtXBBFymmkkWJ~O4Q/i_YvOl2oUaVpW1UTnJao_g

 

So tAggregateSortedRow skips the sorting step, on the assumption that the incoming data is already sorted. For data that is already sorted, tAggregateSortedRow will therefore give better performance.
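To illustrate why sorted input makes aggregation cheap, here is a minimal Java sketch of a running sum over rows that arrive already sorted by key. It is not Talend's generated code; the class name `SortedAggregateSketch`, the method `sumSorted`, and the key/value row shape are all assumptions made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SortedAggregateSketch {

    // Hypothetical helper: sums values per key, ASSUMING rows are sorted by key.
    // While reading, only the current key and its running total are tracked.
    public static Map<String, Long> sumSorted(Iterable<Map.Entry<String, Long>> sortedRows) {
        // The result map stands in for "emit the finished group downstream".
        Map<String, Long> out = new LinkedHashMap<>();
        String currentKey = null;
        long total = 0;
        for (Map.Entry<String, Long> row : sortedRows) {
            // A key change means the previous group is complete.
            if (currentKey != null && !currentKey.equals(row.getKey())) {
                out.put(currentKey, total);
                total = 0;
            }
            currentKey = row.getKey();
            total += row.getValue();
        }
        // Flush the last group, if there was any input at all.
        if (currentKey != null) {
            out.put(currentKey, total);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> rows =
                List.of(Map.entry("a", 1L), Map.entry("a", 2L), Map.entry("b", 5L));
        System.out.println(sumSorted(rows)); // prints {a=3, b=5}
    }
}
```

If the input is not actually sorted, a key that reappears later overwrites its earlier total in this sketch, which is why the sort has to happen first.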

 

Warm Regards,
Nikhil Thampi


 

shameer1
Contributor
Author

Thanks for the reply, Nikhil. But if the source data is not sorted, which of the two options below will give better performance?

1:- tAggregateRow

2:- tSortRow -> tAggregateSortedRow

Or will both of them give the same performance?

Anonymous
Not applicable

Hi Shameer,

 

Performance will be more or less the same, since the same actions have to be performed either way (within one component or across two components).

 

Warm Regards,
Nikhil Thampi


 

 

David_Beaty
Specialist

Hi,

 

The two options you suggest have one major functional difference: if the incoming data set you're trying to aggregate is large, then using tSortRow and tAggregateSortedRow is the only way to go. tAggregateRow has to maintain the data set in memory, so it will run into issues as the size of the data set increases. Use the sort-on-disk functionality in tSortRow and you'll be fine.
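As a rough picture of the memory concern, here is a minimal Java sketch of aggregating unsorted rows entirely in memory. It is a generic hash-based approach, not Talend's actual implementation; the class name `HashAggregateSketch`, the method `sumUnsorted`, and the key/value row shape are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashAggregateSketch {

    // Hypothetical helper: sums values per key for rows in ANY order.
    // No result is final until the whole input has been read, so the
    // map of running totals must stay in memory for the entire run.
    public static Map<String, Long> sumUnsorted(Iterable<Map.Entry<String, Long>> rows) {
        Map<String, Long> totals = new HashMap<>();
        for (Map.Entry<String, Long> row : rows) {
            // Add the value to the existing total, or start a new group.
            totals.merge(row.getKey(), row.getValue(), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> rows =
                List.of(Map.entry("b", 5L), Map.entry("a", 1L), Map.entry("a", 2L));
        System.out.println(sumUnsorted(rows));
    }
}
```

In this sketch the in-memory state grows with the number of distinct keys, so a large input with many groups can exhaust the heap, whereas a sort that spills to disk keeps memory use bounded.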

 

AnnaSchmd
Contributor

Hi David,

 

Does tAggregateRow perform an implicit sort in memory? And does it then keep those sorted rows in memory?