Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all, I am drafting an ETL structure for one of our apps and faced one unsolvable question. Maybe you guys, can share your knowledge and help me with it.
We have one pretty big app with data from way back 2010. Every night I would like to do transformation just for freshest data (the one, that was modified during last 24 hours), but I don't know what to do when - historical data of one of dimmensional tables has changed. Should I reload all fact table or is there an easier way?
For example:
1. During Extract stage I add freshest data about sales from last 24 hours to main Raw_QVD.
2. During Transform stage I modify the freshest data from last 24 hours a by applying applymaps and various joins (let's say I join info about sales person). Then add this data to historical Transformed_QVD
3. During Load stage I load this data to final .qvw application.
What should I do if data about one sales person has changed (changed his name)? Should I reload all historical Transformed_QVD data? Or is there any other way?
It is most often referred to as "Slowly Changing Dimensions" and if you search on that on the community you will find many discussions covering this.
Maybe the most authorative ones are these by Henric Cronström: