Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
In the Data Warehouse mappings, there is a lookup functionality built-in to perform any code lookups from the landing tables in Compose.
I was looking for similar functionality in the Data Mart ETL too.
We have built dimensions where some of the column are codes only known to the source system, in the DataMart layer we want to provide meaningful descriptions for those codes by doing a lookup from the code description dimension (which has very static data that doesn't change very often).
One of the ways to implement this is using custom ETL code in Compose, but since we have so many code columns we are looking for a standard lookup functionality in the Data Mart expressions to avoid maintaining a lot of custom code.
Also we want to avoid doing this type of data denormalization in the warehouse., this requirement fits well in the Data Mart/BI layer to be built for reporting.
Do you have a similar problem to deal with in Compose?
Hi, the typical way to handle this would be to create the lookup tables in the DW model and define the relationships from the code column to the lookup table code in the main model.
This allows the description to be pulled into the data mart automatically, reduces the need to denormalize the data in the DW model, but allows for easy altering of the code / descriptive values if needed through the normal Compose process.
@TimGarrod it will work, but we have hundreds of such code columns in our model which means we will have to define relationship for each one of such columns in our model, on a different note we also observed there is a negative impact on warehouse ETL performance, as you add more relationships in the model in Compose, hence we did not go down that path.
We need the lookup capability only as per need basis in the Data Mart layer, per BI user's request.
Is there a technical limitation in Compose to add lookup capability in Data Mart mapping same as it exists in the Warehouse?
Ok here is the problem: take an example that we have 1000 code columns in our model that may have description based on values and the Data Mart requires only 20 of those code columns to require a corresponding description.
With your approach, you are suggesting to create 1000 relationships in our Core model.
Which means on every ETL run, the Compose ETL commands will perform the heavy-lifting of maintaining 1000 relationship IDs by using JOINs. But in reality only 20 of those relationships will be used in the Data Mart for BI consumption.
Thinking in modern cloud terms, that is a lot of waste of compute!
Correlated select runs for every row in the table, which may not be desirable for very high volume facts or dimensions. A lookup using JOINs is much desirable.
Thanks I submitted to the ideas board