Add Support for Distribution Method for data mart in Azure Synapse Analytics
Description:
In Azure Synapse Analytics, fact tables distribution method reverting back to round robin once the ETL workflow runs, It looks like the process drops and re-creates the fact tables each time (or whenever this happens, the distribution method is reset).
In addition, the distribution method can be set in the data warehouse, but is not available as a setting in the data mart. Therefore, the distribution method is set to "Round Robin" by default, instead of Replicate or Hash (which is much more desirable for dimension and fact tables respectively).
Value Proposition:
I constantly have to manually maintain the distribution method settings on data mart tables because the default behavior does not perform very well in a Microsoft Azure environment. Fixing this will make the product usable.
Target Audience:
Developers
How does this help the user:
Our developer staff will not have to manually maintain these tables every time the code is "generated"... which is often in a development process.
That is correct... we have standardized on "State Oriented" facts because the behavior of "Transaction" facts won't work for us. There is no update, that we could see from our testing, to Transaction facts even though our data warehouse could have more recent changes. The Transaction fact seems to only ever show the original version of any of its facts. Because of this, we are forced to use the "State Oriented" facts and use a view to show the latest version of each fact to simulate the "Transaction" fact. We also can show all the fact versions for those that desire that as well. In any case, we need support to control the distribution method in the data mart, the data warehouse, as well as the staging areas to get performance on Synapse.