jtompkins
Creator

Historical snapshot fact tables

I was wondering whether any other Compose users use the product in a similar way to us.

The data mart (TDMA) tables that are generated in Compose are not what our end users consume. We process a current snapshot daily in Compose and then load that data into a different database, which is what our end users consume via a BI tool. We have a few very large historical fact tables with daily snapshots going back several years. Does anybody else have historical data needs and choose not to use the TDMA tables as their final data mart source of truth?

4 Replies
TimGarrod
Employee

Hi.

Customers leverage the data mart and central data model in many different ways. While some certainly take the output of the data mart and perform additional processing, this type of requirement can often be met by using "State-Oriented" fact tables to provide daily snapshot support. That approach reduces the storage and processing required each day and enables incremental processing instead of a complete rewrite, while still supporting daily snapshot functionality. Whether the implementation fits your needs obviously depends on your specific business and analytics requirements.
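To make the state-oriented idea concrete, here is a minimal sketch of how a from/to-dated fact can be expanded back to a daily snapshot grain at query time. The names used (FACT_CONTRACT_STATE, DIM_DATE, FROM_DATE, TO_DATE) are hypothetical and are not objects Compose generates; they only illustrate the pattern.

-- Hypothetical state-oriented fact: one row per contract state,
-- valid from FROM_DATE up to (but not including) TO_DATE.
-- Joining to a calendar table reconstructs one row per contract per day.
SELECT d.CALENDAR_DATE   AS SNAPSHOT_DATE,
       f.CONTRACT_ID,
       f.CONTRACT_VALUE  -- semi-additive measure carried on the state row
FROM   FACT_CONTRACT_STATE f
JOIN   DIM_DATE d
  ON   d.CALENDAR_DATE >= f.FROM_DATE
  AND  d.CALENDAR_DATE <  COALESCE(f.TO_DATE, CURRENT_DATE)
WHERE  d.CALENDAR_DATE BETWEEN DATE '2012-01-01' AND CURRENT_DATE;

A contract that does not change for a year is stored once rather than 365 times, which is where the reduced storage and processing comes from; the daily grain is materialized only when it is queried (or when loading a downstream snapshot table).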

Can you provide some more info on the daily snapshot (I assume it's a semi-additive measure problem) and on why the consumer-facing tables are managed externally?

jtompkins
Creator
Author

Currently, the one fact table we have in production contains contract data. The business requirement is to retain a daily snapshot of all of our contracts. We were originally using a state-oriented fact table, but we changed it to a transactional fact table in order to keep the grain of one row per contract, as the state-oriented fact was inserting multiple rows for SCD2 attributes. We also do not yet use Replicate, so our source data gets processed in its entirety every day.
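To illustrate why the state-oriented fact broke the one-row-per-contract grain (table and column names here are placeholders, not our real objects): every SCD2 attribute change inserts another row for the same contract, so getting back to a single row per contract would require a dedup step along these lines.

-- Keep only the most recent state row per contract as of the snapshot date,
-- restoring a grain of one row per contract.
SELECT *
FROM (
    SELECT f.*,
           ROW_NUMBER() OVER (PARTITION BY f.CONTRACT_ID
                              ORDER BY f.FROM_DATE DESC) AS RN
    FROM   FACT_CONTRACT_STATE f
    WHERE  f.FROM_DATE <= CURRENT_DATE
) latest_state
WHERE RN = 1;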

The reason these tables are managed externally is that, historically, they are the tables our BI tool (OBIEE) has consumed, and changing that would involve a complete reworking of that data model.

TimGarrod
Employee

Understood. Then I think there are two approaches to this:

1. Do what you are doing. 🙂

2. In the central model layer, I've seen customers with similar requirements run loads daily and add a DATE column as part of the key, using <current_date> as the date each time the data is processed. This allows Compose to manage the snapshots over time (a rough sketch of the idea follows below). From a performance perspective, you can potentially also have a specific ETL set that uses "optimized for initial load", where we perform INSERT-only operations into the HUB table. Some ETL set features are disabled when using this method, so be aware of that, but I imagine that since you are running a non-Replicate-based process today, you may not need those features.
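As a rough sketch of what approach 2 amounts to (this is not the code Compose generates; names such as HUB_CONTRACT, STG_CONTRACT and SNAPSHOT_DATE are placeholders), each daily run appends the full current image of the source stamped with the run date, so the date becomes part of the logical key:

-- Hypothetical insert-only daily snapshot load.
-- Every run appends the current image of each contract, keyed by
-- (CONTRACT_ID, SNAPSHOT_DATE), instead of updating existing rows.
INSERT INTO HUB_CONTRACT (CONTRACT_ID, SNAPSHOT_DATE, CONTRACT_STATUS, CONTRACT_VALUE)
SELECT s.CONTRACT_ID,
       CURRENT_DATE,          -- the <current_date> part of the key
       s.CONTRACT_STATUS,
       s.CONTRACT_VALUE
FROM   STG_CONTRACT s;        -- full daily extract from the source

Because each run is a pure append, the "optimized for initial load" / INSERT-only path skips the compare-and-update work of a normal load, which is what makes it attractive for this pattern.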

Hope that helps!

Tim.

jtompkins
Creator
Author

I don't know if the 2nd option is feasible from a performance perspective, because our daily snapshots go back to 2012 and total roughly 500 million rows; processing at that scale every day would take far too long. That said, I think our only option is what we are currently doing. The problem is that whenever the slightest change is made to the TDMA tables, Attunity wants to drop and recreate everything, and we lose referential integrity: when those tables are reloaded, the keys no longer reference the same values as they do in our historical fact table.
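To make the referential-integrity loss concrete (names below are placeholders, not actual Compose objects): once the data mart tables are dropped, recreated and re-keyed, a check like this shows historical fact rows whose stored surrogate keys no longer resolve, and even rows that still join may point at a different member than they did before the rebuild.

-- Hypothetical check for historical fact rows orphaned by a rebuild
-- that regenerated the dimension's surrogate keys.
SELECT COUNT(*) AS ORPHANED_ROWS
FROM   FACT_CONTRACT_HISTORY f
LEFT JOIN DIM_CONTRACT d
       ON d.CONTRACT_KEY = f.CONTRACT_KEY
WHERE  d.CONTRACT_KEY IS NULL;

Because the regenerated surrogate values are assigned fresh, only keys derived from the natural/business key would survive a drop-and-recreate unchanged.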