Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
What I'm trying to do is after aggregating the rows, there must be a two results - one containing result of aggregation and one containing unaggregated row.
1. Problem
When I used tAggregateRow component, it has given me the wrong result as in below screenshots:
Why I'm getting null values in a resulting columns - PODate & VendorNumber ?
Thanks !
Thanks @rhall for let me know tAggregateRow component is not sufficient for my requirement.
The solution suggested by you is good but I can't apply in my case as raw data has no any fix data type of any column, so sort would be not a good idea. Well, I've achieved the result using hashing and then inner join in following way:
1. Changed schema of tAggregateRow component - removed 2 columns for which I was getting null values
2. Used inner join to check for match and unmatched
Result:
I don't know if its an efficient solution, but its working perfectly.
Thanks
Hi,
The null values are caused by not including the fields in the group by. On the other hand, when you do include VendorNumber in the aggregation, it will not do the aggregation you wish. What exactly do you expect as result for 11676 ?
To distinguish unaggregated records, you could consider adding an extra output field and add an count. When the count==1 then it is not aggregated.
Thanks @saukema for your quick reply.
"The null values are caused by not including the fields in the group by" - How to do this ? I've tried but not getting desired result.
For 11676 I expect the result to be :
11676 | 28-07-2016 | BLUE3 | 30-09-2016
So there should no null.
To distinguish unaggregated records, I used tMap and its working fine. The only problem is of null values.
So you want to group by PONumber, return the Max LastUpdated date and return the other column values that correspond to the LastUpdated date (Max) returned? This is not so easy with this component on its own. But you can engineer a solution.
1) Order your data by PONumber and LastUpdated date using a tSortRow. Ensure that your order by PONumber first and LastUpdated second.
2) In your tAggregate component, group by the PONumber (as in your example) but select the LAST function for ALL output columns (assuming that your ordering is set to order by date ascending in your tSortRow component.
This will mean that your max date will be returned with the values that correspond to that.
Thanks @rhall for let me know tAggregateRow component is not sufficient for my requirement.
The solution suggested by you is good but I can't apply in my case as raw data has no any fix data type of any column, so sort would be not a good idea. Well, I've achieved the result using hashing and then inner join in following way:
1. Changed schema of tAggregateRow component - removed 2 columns for which I was getting null values
2. Used inner join to check for match and unmatched
Result:
I don't know if its an efficient solution, but its working perfectly.
Thanks
I'm confused, if you cannot sort because of a lack of fixed data types, how were you able to use the Max function successfully? The Max functionality and Sort functionality are practically the same
That would suit using the way I suggested, which would be a little bit more efficient than your workaround. But if it works 🙂
I solved it in another way which might be easier.
You could add the columns you need to the operation list in the component properties and use one of the appropriate functions like "first" wich will return only the first row value, or "list" if you want to groups the values in a comma separated string.
Check this screenshot:
Regards!