Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
There are multiple disparate sources (e.g SAP sales module per region) for a business process (e.g Sales Order ) and the requirement is to build a sales order data mart that provides self-service across all source systems.
Want to understand the solution approach in Attunity compose to automate the process of building data- warehouse (Data Vault) & the data mart layer ?
Sources:
Queries:
Hi Foreverfrnd -
I'm gonna answer #3 first, with some context....
In Compose, I would model the structures to fit my business process. As with most any data warehouse, the central DW entities should be modeled to fit the business as a whole and ETL mappings used to load those structures.
Model Discovery is a great feature to get the model started. In some cases its all you will need, but in cases where you have disparate sources with wildly different structures, the discovered model can (at the very least) bootstrap the model design. You should not discover all sources independently, but start with a source that has the majority of data elements you need and then adjust from there. Its possible you may need to normalize, de-normalize, pivot data or even reverse relationships to structure the model to accommodate your business needs.
I would look at it from a business process perspective. First thing to ask - do you allow an order to have multiple status's and or multiple business partners ?
If so - you definitely need to follow your top example as the target structure to support your business flexibility. If you don't - then either 1 or 2 works.
Whether you follow approach 1 or 2, you will have to create a couple of mappings manually to load the appropriate target structures.
If you take 1 - then you need mappings for the second source system to load the BusinessPartner and SO_status tables from the VBAP table and mappings for the csv files. (You can create views or query based mappings for the csv files that are loaded to the DW).
If you take 2 - then you need mappings for systemA to load what would become businesspartner and status attributes of the order table. (and csv processing).
When you have consolidation of data from disaparate systems, there is a notion of "business logic" that must be applied to the mappings (and oftentimes to the model). Compose will automate the initial creation of the model, mappings and thus etl code - but there are always going to be cases when some manual effort for re-mapping of tables etc is required.
FYI - For the cases where you have the same (or even similar) structures in multiple sources you can duplicate the ETL set and define the new source schema for the ETL mappings. Compose will duplicate the mappings, using the new landing schema as the source . See section 9 in the user guide (Creating and Managing the Data Warehouse >> Adding and Duplicating ETL Sets).
Hope that helps.
Thats the long answer - the short answer to your specific questions -
1. Compose requires a source model only if you wish it automates the entire dw process. As explained above, this is not always the case as there are conditions where business logic must be implemented. What Compose requires is a MODEL to define the structure of the DW.
2. Unification of the sources is done via creating a canonical DW model and using ETL mappings to consolidate the multiple sources. So Compose doesn't need a canonical ODS - but in these cases not everything can be automated (nor should it be)
Thanks for the detailed explanation.Will create a source unified 3NF data model by extending source 1 (take 1) data model. However, have below follow-up questions. Request your help.
1. Correct. ODS is (IMO) an overloaded term with different meanings. Often it represents a consolidated view of the data - but you'll also see architectures where an ODS is just a replicated copy of the source. I've also seen a number of data warehouses in my data where there are no requirements for Type 2's at all. Semantics aside - Compose can handle what you want to do 🙂
2. In Compose you can use the Discover process to bootstrap the data model and adjust from there. As you've said - there's no such thing as a free lunch when it comes to understanding your business and when it comes down to harmonizing data between multiple sources that are structured completely differently there will always be some level of business acumen and manual adjustment. In my experience designing the model for an EDW is not the most time consuming part - its coding the ETL that has always been the longest pole in the tent. Even when creating a custom model, Compose provides automation for the ETL and mappings can be uploaded via our csv import mechanism to reduce the amount of manual effort in the logical mapping process. (Remember that Compose will auto-gen the technical aspects of ELT for you.
3. Not sure I understand the question. Perhaps it would be best to have a discussion on what you are trying to do and how you may accomplish it in Compose ? (I find these deeper modeling questions are typically easier to discuss in person). If you are open to that - I will dm you my email and we can connect in-person ?