
Specialist III
2009-09-30
05:25 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
MySql ETL: Incremental updates to aggregate tables
We have MySql databases that have many tables that each contain millions of rows of data (e.g. 50mil rows). We are investigating ETL tools that would allow us to create a data warehouse that creates aggregate views of these tables that are updated whenever the rows change, or new rows are added. Running a query to regenerate the aggregated view would take hours (if not days) to complete, so we would need to incrementally update the aggregate view as the source data changes. Ideally, these updates would happen in real-time (or near-real-time).
Can anyone provide some insight about whether Talend would be able to help us in this type of scenario? I assume this is a pretty common use case, so please feel free to contribute your suggestions on how you handle this scenario.
Thanks in advance!
David
Can anyone provide some insight about whether Talend would be able to help us in this type of scenario? I assume this is a pretty common use case, so please feel free to contribute your suggestions on how you handle this scenario.
Thanks in advance!
David
196 Views
2 Replies

Anonymous
Not applicable
2009-10-01
07:50 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
hi david,
if i understand you correctly you are looking to capture change from the 'detail' table and propagate the measure values to your aggregated view/table using an ETL tool because it is too slow to do with your DB.
i would still use the DB to create a mechanism for identifying record changes for inserts/updates date column and valid/invalid column or a table with IDs of deleted records.
and then creating a Talend job to integrate those changes into your aggregated view/table.
hope it helps you.
if i understand you correctly you are looking to capture change from the 'detail' table and propagate the measure values to your aggregated view/table using an ETL tool because it is too slow to do with your DB.
i would still use the DB to create a mechanism for identifying record changes for inserts/updates date column and valid/invalid column or a table with IDs of deleted records.
and then creating a Talend job to integrate those changes into your aggregated view/table.
hope it helps you.
196 Views

Anonymous
Not applicable
2009-10-09
07:04 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
196 Views
