Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I've a raw file which has data which looks like this:
account_code|payment_period|payment_date|transaction_type_code|fund_type_code|amount|next_date|payment_status
7H0068T1|201901|2019-01-15|RFP|C|2521.40000000|2019-06-14|DONE
7H0068T1|201901|2019-01-15|RFX|C|678.60000000|2019-06-14|DONE
7H0068T1|201901|2019-01-15|WHT|C|800.00000000|2019-06-14|DONE
7J0004T1|201912|2019-12-13|RFP|C|985.31000000|2019-06-14|PENDING
7J0004T1|201912|2019-12-13|RFX|C|614.69000000|2019-06-14|PENDING
7J0004T1|201912|2019-12-13|WHT|C|400.00000000|2019-06-14|PENDING
I want to sum up the values from the amount column, for each account (account_code), for each day (payment_date) -> only for the records where transaction_type is RFP or RFX. So in the above data, for 7H0068T1, I need to sum up 2521.4 and 678.6 (and exclude WHT/ 800.0)
The iAggregate node doesn't seem to work for me or I am missing something.
Technically it's a combo of 'group by' and 'where' and the tAggregate only provides me the option to group by
Hi,
If you are using a DB, why you need to combine the dataset? You can add a tDBOutput to each stream so that each dataset will be loaded independently.
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
The reason is that I wish to hit the DB once. As in, combine everything into one stream and then update the DB using tDBOutput
Hi,
I don't think DB will have any serious impact if we hit 3 times instead of 1 especially in batch mode.
If you want to use a single session, you can use a tDBConnection at the beginning and use the same connection for all 3 DB objects. End of day, you will have to use DB resources to push based on the number of records.
But if you really want to collect the data to one dataset and then push to DB (definitely not my preferred method), you can first collect the data to three tHashOutput components (where the second and third hash will be in append mode to first one). Then read this Hash using a tHashInput and then push to DB (this means memory overhead due to usage of Hash).
You can use files instead of Hash to do the same operation but this means you are doing additional IO and CPU operations. So my view is that the optimum way is to push through three DB components.
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂