I am working with a data warehouse that is using Qlik Replicate. The architecture is as follows:
Qlik Replicate replicates source data into non-truncated stage tables in the target database (stage rows are updated in place)
Data is ETL'd from stage into Data Vault (a middle database layer tracking all changes)
Data is ETL'd from Data Vault to a data mart dimensional schema
I would like to understand if the following is possible in Qlik Replicate:
Qlik Replicate replicates source data to non-truncated stage tables (as above), BUT during replication it detects deltas (changes to pre-existing rows and also source table deletes). Where there are deltas to existing rows, it writes new historically versioned records (start date, expiration date), and it marks deleted source rows as soft deletes. This would then become a Persistent Stage Area and eliminate the need for the Data Vault middle layer (see the sketch after these points for the kind of versioning I mean).
ETL data directly from Persistent Stage to dimensional schema
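To illustrate the behavior we are hoping Replicate (or an add-on) could produce, here is a rough hand-rolled ELT sketch. The table PSTAGE_CUSTOMER and the columns start_date, expiration_date and is_deleted are hypothetical, not anything Replicate creates today:

-- 1. Expire the open version of each row whose values changed in stage
--    (NULL-safe comparisons omitted for brevity)
UPDATE pstage_customer p
SET    expiration_date = CURRENT_DATE
WHERE  p.expiration_date IS NULL
AND    EXISTS (SELECT 1
               FROM   stage_customer s
               WHERE  s.customer_id   = p.customer_id
               AND   (s.customer_name <> p.customer_name
                  OR  s.status        <> p.status));

-- 2. Insert a new open version for changed rows and for brand-new rows
INSERT INTO pstage_customer
       (customer_id, customer_name, status, start_date, expiration_date, is_deleted)
SELECT  s.customer_id, s.customer_name, s.status, CURRENT_DATE, NULL, 'N'
FROM    stage_customer s
WHERE   NOT EXISTS (SELECT 1
                    FROM   pstage_customer p
                    WHERE  p.customer_id     = s.customer_id
                    AND    p.expiration_date IS NULL);

-- 3. Soft-delete open versions whose row no longer exists in the replicated stage
UPDATE pstage_customer p
SET    is_deleted = 'Y'
WHERE  p.expiration_date IS NULL
AND    NOT EXISTS (SELECT 1
                   FROM   stage_customer s
                   WHERE  s.customer_id = p.customer_id);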
If this is possible, it could save hundreds or thousands of hours by eliminating the need for a Data Vault layer and all that goes with it. If it's not possible, we could still ETL incremental loads to a separate Persistent Stage layer which would track all the historical changes. That is less than optimal, though, as we would essentially have two copies of the staging data, unless we pruned the non-persistent stage each month (which we could do).
What you are describing are requirements for the Data Warehouse automation add-on to Replicate. We do have that - Qlik Compose! Replicate targets become Compose sources. Compose will leverage the 'Change Tracking' feature selected in your Replicate task to expose the deltas - for example a record 'Dave Hecksel' flagged I (I for insert / original record) followed by 'David Hecksel' flagged U (U for update, marking it as an update of the original record on the source).
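As a hedged illustration of what those tracked deltas look like: when a Replicate task is set to store changes, the change rows land in per-table change tables. The sketch below assumes the common __ct table suffix and header columns such as header__change_oper ('I' insert, 'U' update, 'D' delete), header__change_seq and header__timestamp; actual names can differ by Replicate version and task settings.

-- Hypothetical contents of the change table behind the 'Dave Hecksel' example:
-- customer__ct
-- header__change_oper | header__timestamp    | customer_id | customer_name
-- 'I'                 | 2023-01-05 09:14:00  | 1001        | Dave Hecksel    (original insert)
-- 'U'                 | 2023-02-11 16:02:00  | 1001        | David Hecksel   (later update)

-- Reading the deltas in the order they occurred on the source:
SELECT header__change_oper,
       header__timestamp,
       customer_id,
       customer_name
FROM   customer__ct
ORDER  BY header__change_seq;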
Take a look at this demo of Compose automating the creation of a Data Warehouse and Data Mart for analytics-ready data: Qlik Compose Demo, https://www.youtube.com/watch?v=yT9fhJnFx4g (Agile Data Warehouse Automation: Do in Days and Weeks what takes Months and Years elsewhere), 14 minutes long.
Compose understands the tables Replicate generates. Because Replicate is delivering real-time data, you inherit a near real-time Data Warehouse. Compose will automate creation of the data model (using a Data Vault style or ingestion of an ERwin model), auto-generate a set of ELT SQL from that model, and auto-generate a second set of real-time ELT SQL. From there you load and update the Warehouse. You now have the Warehouse, continuously updating itself, say every 10 minutes. How about a Data Mart? Pick a fact table in the Warehouse: Compose will auto-generate a data model (star schema) and generate both batch and real-time ELT sets for one or more Marts.
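For intuition only (this is not actual Compose output), the difference between the two generated ELT sets is roughly this: the batch set loads from the full replicated copy, while the real-time set applies only change rows captured since the last run. Table names such as dwh_customer_history and etl_control are hypothetical:

-- Real-time style ELT: apply only change rows newer than the last processed sequence
INSERT INTO dwh_customer_history
       (customer_id, customer_name, status, change_oper, change_ts)
SELECT  ct.customer_id,
        ct.customer_name,
        ct.status,
        ct.header__change_oper,
        ct.header__timestamp
FROM    customer__ct ct
WHERE   ct.header__change_seq > (SELECT last_processed_seq
                                 FROM   etl_control
                                 WHERE  table_name = 'customer');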
Thanks. I think what you are saying is that Compose will use change data capture (CDC) over stage and then, in a subsequent process, move that data (inserts, updates, deletes) to a persistent stage. We do NOT want to use a Data Vault for the persistent stage, but simply a versioned copy of the replicated stage, which will contain all history. The Persistent Stage will then be the source for Kimball-style dimensional data marts.
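To make that intent concrete: with a persistent stage shaped like the earlier sketch (hypothetical pstage_customer with start_date, expiration_date, is_deleted), a Kimball Type 2 dimension can be sourced directly from it, because the history is already versioned; no Data Vault layer is needed purely for change tracking. Again, table and column names here are illustrative only:

-- Populate a Type 2 customer dimension straight from the persistent stage;
-- the version dates carry over from the persistent stage rows
INSERT INTO dim_customer
       (customer_id, customer_name, status, effective_date, end_date, is_current, is_deleted)
SELECT  p.customer_id,
        p.customer_name,
        p.status,
        p.start_date,
        COALESCE(p.expiration_date, DATE '9999-12-31'),
        CASE WHEN p.expiration_date IS NULL THEN 'Y' ELSE 'N' END,
        p.is_deleted
FROM    pstage_customer p;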