Skip to main content
Announcements
Introducing a new Enhanced File Management feature in Qlik Cloud! GET THE DETAILS!
cancel
Showing results for 
Search instead for 
Did you mean: 
sushantV
Creator
Creator

How do I aggregate data, conditionally?

I've a raw file which has data which looks like this:

 

account_code|payment_period|payment_date|transaction_type_code|fund_type_code|amount|next_date|payment_status

7H0068T1|201901|2019-01-15|RFP|C|2521.40000000|2019-06-14|DONE
7H0068T1|201901|2019-01-15|RFX|C|678.60000000|2019-06-14|DONE
7H0068T1|201901|2019-01-15|WHT|C|800.00000000|2019-06-14|DONE

7J0004T1|201912|2019-12-13|RFP|C|985.31000000|2019-06-14|PENDING
7J0004T1|201912|2019-12-13|RFX|C|614.69000000|2019-06-14|PENDING
7J0004T1|201912|2019-12-13|WHT|C|400.00000000|2019-06-14|PENDING

 

I want to sum up the values from the amount column, for each account (account_code), for each day (payment_date) -> only for the records where transaction_type is RFP or RFX. So in the above data, for 7H0068T1, I need to sum up 2521.4 and 678.6 (and exclude WHT/ 800.0)

 

The iAggregate node doesn't seem to work for me or I am missing something.

 

Technically it's a combo of 'group by' and 'where' and the tAggregate only provides me the option to group by 

 

Labels (2)
12 Replies
Anonymous
Not applicable

Hello,

 

Please add a tFilterRow or tMap Column before tAggregateRow. tAggregateRow does not support Group By with Where.

 

Thanks,

Subhadip

sushantV
Creator
Creator
Author

That's exactly what I started doing. Can't the tFilter row output to multiple streams? Based on my data type, I'd want to output to three streams, for three types of data values.

Anonymous
Not applicable

Hello,

 

If you want to divide data in three output streams along with filter, please use tMap instead of tFilterRow.

 

Thanks,

Subhadip

Anonymous
Not applicable

Hi @sushantV 

 

    Could you please explain more when you are saying output to three streams? Right now, the information provided is partial so the solutions provided also is based on the inputs from your end.

 

     If your objective is to filter the data to multiple groups and then take the aggregation, you can do that way. What you will have to do is to filter the unwanted records out (you can send the unwanted types to different stream by Reject condition) and then send to different flow. 

 

     Now if your objective is to take aggregate for each transaction type and then send it to different flow, that is also possible. After taking the aggregation, you can send the result set to a tMap where you can add the output expression filter to pick necessary transaction types.

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂

sushantV
Creator
Creator
Author

@nthampi 

Nikhil, there are three subtypes in the data and I need to aggregate just two of them. So I split up the data into three streams (one for each subtype), using the tMap. Let's call these three streams A, B and C. I intend to sum up the A and B, and ignore the C for now. I thought I'd be able to combine them using the tMap but I gues I am missing something,

 

Here is how the flow looks like: 0683p000009M5Te.png

 

I inted to combine the RFP and RFX data so that I can aggregate the amounts for each account, on each day. That can be done using tAggregate but I need to combine the data, before I can attempt the tAggregate node.


Screenshot from 2019-06-06 14-23-18.png
Anonymous
Not applicable

Hi,

 

    Let us first do the segregation of data for the two types you want.

0683p000009M5Nh.png

 

If you do like above, it will pick both types and remove any other types. Now, send this data to the aggregation component. Do not add the transaction type as a field for aggregation so that your sum will be based on only key fields like account number.

 

In this way you should be able to get your results.

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂

sushantV
Creator
Creator
Author

@nthampi I am dumb. Don't have to over-complicate things. Your solution is elegant. Thanks.

 

BTW I still need a way to combine flows. A further requirement wants me to sum up for all the subtypes. I can find my way around it but I am curious about knowing how could I combine two subtypes (as in the screen I shared), into a tMap component.

 

 

Anonymous
Not applicable

Hi,

 

    No need to worry. You will soon become a master of Talend 🙂

 

    Coming to your followup query, you can do that before segregation of data using tMap. I would suggest you to use a tReplicate to duplicate the data to two streams. One stream can be used for total sum where you will send the output to a taggregaterow directly. Another stream can be passed to tMap where you will separate the data based on transaction types followed by taggregaterow.

 

    Hope I have answered all your queries for this post. Please spare a second to mark the topic as resolved 🙂

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂

 

 

sushantV
Creator
Creator
Author

Almost there. Even after I split the flow in two and process them separately, I need to combine them all, before putting that info in the database. How can I do that?