Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I need to group and then filter out certain rows of data within each group and I am struggling on how to achieve this.
Here is some data (columns on top):
Id, lastmodifieddate, relatedto,address
1,2017-1-1,X,address1
2, 2015-1-1,Y,address2
3, 2017-6-6,X,address3
4, 2015-1-1,Z,address4
5, 2014-1-1,Y,address5
In this case, I want to group the records by relatedto and then only return the record with the most recent lastmodifieddate. So, for the example above, I want the results to only return 3 records as shown below:
Id, lastmodifieddate, relatedto,address
2, 2015-1-1,Y,address2
3, 2017-6-6,X,address3
4, 2015-1-1,Z,address4
I used tAggregateRow with group by relatedto and max operation on lastmodified date to get the row. However, when I use the tLogRow component to log the output, I only see the relatedTo and lastmodified column, but the Id and address columns are null.
How can I get the Id and address column to show the value for the row that has the latest lastmodified date for each relatedto?
I am new to Talend so be kind 🙂
Thank you.
Hi,
The simplest solution (here the input is from tFixedFlowInput for testing):
- tSortRow to order records on lastmodifieddate descending order
- tUniqRow to eliminate duplicate records on relatedto
- tSortRow to order records on id ascending order
That's all
Hi,
The simplest solution (here the input is from tFixedFlowInput for testing):
- tSortRow to order records on lastmodifieddate descending order
- tUniqRow to eliminate duplicate records on relatedto
- tSortRow to order records on id ascending order
That's all
Thank you TRF.
I was missing the tUniqRow piece.