What is the software difference between Compose for Data Warehouse vs for Data Lakes?


What is the software difference between Compose for Data Warehouse and Compose for Data Lakes? We are on version 6.5.

From a design point of view, a data warehouse is for structured data, while data lakes are for structured and unstructured data, but I don't understand what the difference is between the two software options for Attunity.

4 Replies

To add to Trafoosss answer -

For Compose for Data Lakes, consider the architecture of a data lake. It is often designed with multiple "zones", for example the medallion approach, where you have Bronze [data that looks just like the source] --> Silver [some transformation / data quality rules applied, etc.] --> Gold [fully curated, transformed datasets].

Compose for Data Lakes operates very much in the bronze layer of the lake. It understands how Qlik Replicate delivers change data to the lake and automates the process of transactional data management in the lake. It generates Spark / Hive / Spark SQL code (depending on the compute environment) to "stitch" the transactions together, while also enabling certain schema evolution features. Further transformation / curation / data quality work would be performed downstream, on the datasets that Compose for Data Lakes generates. Compose for Data Lakes is built to operate in a "traditional" lake environment. Think S3, ADLS Gen2, Google Cloud Storage, or HDFS, with compute layers such as EMR / Databricks / HDInsight / Dataproc.
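To make the "stitching" idea concrete, here is a minimal sketch in plain Python. It is not the code Compose generates (that is Spark / Hive / Spark SQL and depends on your environment); the record layout (sequence, operation, key, row) and the function name stitch are assumptions made up for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical change record: (sequence number, operation, primary key, row).
# This layout is an assumption for illustration, not Replicate's actual format.
Change = Tuple[int, str, int, Optional[dict]]


def stitch(base: Dict[int, dict], changes: List[Change]) -> Dict[int, dict]:
    """Apply change records to a base load, in sequence order."""
    # Copy the base load so the caller's data is left untouched.
    current = {key: dict(row) for key, row in base.items()}
    for _seq, op, key, row in sorted(changes, key=lambda c: c[0]):
        if op in ("INSERT", "UPDATE"):
            current[key] = dict(row)  # the latest version of the row wins
        elif op == "DELETE":
            current.pop(key, None)    # remove the row if it is present
        else:
            raise ValueError(f"unknown operation: {op}")
    return current


# Sample data: a full load plus three changes that arrive out of order.
base_load = {1: {"name": "a"}, 2: {"name": "b"}}
change_log = [
    (3, "DELETE", 2, None),
    (1, "UPDATE", 1, {"name": "a2"}),
    (2, "INSERT", 3, {"name": "c"}),
]
stitched = stitch(base_load, change_log)
```

The point of the sketch is only the ordering and the last-change-wins logic; a real lake does this per partition on the compute engine rather than in memory.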


Compose for Data Warehouse provides end-to-end data warehouse life-cycle management against a relational data warehouse platform (think Snowflake, Redshift, Azure Synapse, Oracle, SQL Server). It provides features to help manage and automate the entire data warehouse life cycle: modeling, ELT mappings with automated code generation, data marts, documentation, etc. Compose for Data Warehouse provides more transformation / quality / data validation features than Compose for Data Lakes, because of the arena in which it operates.


Having said that, Compose for Data Warehouse can also be used to build "relational" lakes (Snowflake, for example, is becoming a very popular platform to support both lake and warehouse workloads).


Hope this helps!


Hi Mwallman,

Compose for Data Lakes is also for structured data files that are ingested into cloud storage such as S3, ADLS Gen2, GCS, or HDFS. Compose for Data Lakes creates a full standardized history of the data using a compute engine such as Spark or Hive. The standardized history data is stored Snappy-compressed in Parquet format within a data lake storage zone, depending on the compute engine vendor (for example EMR, HDInsight, Databricks, etc.). Provisioned data sets can be created in the data lake storage bucket and target folder of your choice. Provisioned data sets can be historical data sets, operational data sets, or snapshots, and are generated from the standardized data in the storage zone. (Depending on the compute vendor, some provisioning types may not be available.)

Compose for Data Warehouse is for data warehouse targets (for example Synapse, Redshift, Snowflake, etc.). You can take data that Qlik Replicate has replicated to a data warehouse target and use it to create, and automate the creation of, a data warehouse and data marts on that target. Data captured on the target by the Qlik Replicate CDC process will then be loaded into the data warehouse schema and data marts through the Compose for Data Warehouse workflow. (Data in schemas on the target that did not come through the Replicate process can be loaded into the data warehouse and data mart schemas as well.)
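As a rough illustration of how a snapshot or an operational (current) data set can be derived from a standardized history, here is a small Python sketch. The record fields (ts, key, row, deleted) and the function names are hypothetical and chosen for this example; they are not Compose's actual schema or provisioning logic.

```python
from typing import Dict, List

# Hypothetical history records: one entry per change, oldest to newest.
# Field names (ts, key, row, deleted) are assumptions for illustration.
history: List[dict] = [
    {"ts": 1, "key": 1, "row": {"v": "a"}, "deleted": False},
    {"ts": 2, "key": 2, "row": {"v": "b"}, "deleted": False},
    {"ts": 3, "key": 1, "row": {"v": "a2"}, "deleted": False},
    {"ts": 4, "key": 2, "row": None, "deleted": True},
]


def snapshot_as_of(records: List[dict], as_of: int) -> Dict[int, dict]:
    """Rebuild the state of the data as it was at time `as_of`."""
    state: Dict[int, dict] = {}
    for rec in sorted(records, key=lambda r: r["ts"]):
        if rec["ts"] > as_of:
            break  # ignore changes made after the snapshot time
        if rec["deleted"]:
            state.pop(rec["key"], None)
        else:
            state[rec["key"]] = rec["row"]
    return state


def operational(records: List[dict]) -> Dict[int, dict]:
    """Current state: a snapshot taken at the latest change."""
    if not records:
        return {}
    return snapshot_as_of(records, max(r["ts"] for r in records))


snapshot_at_2 = snapshot_as_of(history, 2)
current_state = operational(history)
```

In this sketch the historical data set is simply the full history list, a snapshot is the state at a chosen point in time, and the operational data set is the snapshot at the most recent change.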


In the link below you can see a detailed explanation of the difference between a data lake and a data warehouse.



Hello team,


If our response has been helpful, please consider clicking "Accept as Solution". This will assist other users in easily finding the answer.



Sushil Kumar