Add data warehouse function to reinitialize the table[s] for a given entity or group of entities
There have been a number of occasions when after introducing an entity into the data warehouse model, and developing the data load process, where I have needed to change an approach or correct an error, and the ideal thing to do would have been to purge the table[s] (since there can be satellite tables involved, and also load error tables) back to pristine state, as created by Compose with various required null records, etc.
However, neither support nor I have been able to find a way to do this natively; usually I end up dropping the whole data warehouse, and recreating it and reloading it from scratch.
So far, this has not been fatal, but soon we will start to keep row history in the warehouse, and this process would mean losing all of our history.
Please, Compose needs a function to reinitialize the table[s] for a given entity or group of entities.
We are in the initial stage of Development and this will be a huge issue for us after production deployment. We definitely do not want to lose history in the WH tables. This is a must have for us.
The reason there is no support for dropping a single entity or several select entities because when reloading these entities there is no guarantee that the surrogate key of records will be preserved. Since other entities could include a relationship to the dropped entity with the surrogate key stored for that relationship, reloading could result in incorrect relationships.
A way to work around this would be to preserve the business-key to surrogate key mapping in the hub and then reload the table. This is obviously more complex than just drop&create. I'm opening this up for community feedback and voting.
I would like to suggest a new idea for Compose to preserve surrogate key values, both at the DWH level in the event of reinitializing entities, as well as the Data Mart level when dropping or truncating TDMA tables, preserving the VIDs. Would such a request even be plausible / realistic to implement?
>> A way to work around this would be to preserve the business-key to surrogate key mapping in the hub and then reload the table.
I think that this approach should be plausible, and actually not that complex, at least conceptually; implementation in SQL may be a little tricky, but perhaps not either.
I also would like to add that if we already have a table alone, and its at the bottom of the granularity, we should be able to drop and recreate it. and if its in the middle of a hierarchical relationship, then automatically we should be able to drop and recreate all the related tables.
I hit an issue where a DWH had to be fully recreated because 1 single new table which wasnt even loaded failed to be adjusted, and we couldnt even create the adjust script. (which is another problem). a simple solution would have been to drop and recreate that table alone. and not reload 60 million records for no reason.
NOTE: Upon clicking this link 2 tabs may open - please feel free to close the one with a login page. If you only see 1 tab with the login page, please try clicking this link first: Authenticate me! then try the link above again. Ensure pop-up blocker is off.