Qlik Compose CDC ETL Task/Instructions run Performance with Snowflake as target
Currently we have a Qlik Compose for Data Warehouses project with 106 Entities (and growing), with Snowflake as our target database.
Whenever we run the CDC ETL task, it runs for several minutes (sometimes more than 10 minutes).
Our generated ETL set has more than 5,000 SQL instructions to run.
We observed that sometimes data is modified for only a few source Entities out of the 106, yet Compose still runs all 5,000 instructions, which causes the ETL task to run for several minutes.
We are looking for near real-time data refresh in our Data Warehouse and Data Marts.
Could Compose be made smarter, so that it checks a condition such as "where count(*) > 0" on the __CT tables in landing and runs only the ETL instructions for the corresponding mappings?
I think that would improve performance a lot compared with "always" running 5,000 SQL instructions, since only the ETL for modified entities would be run.
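To make the request concrete, here is a rough sketch of the filtering I have in mind. It is only an illustration, not how Compose works internally: the `(entity, sql)` instruction records, the function names, and the sample entity names are all made up, and the row counts are assumed to come from something like a `count(*)` query against each entity's __CT table in landing.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical instruction record: (entity name, SQL text).
# This is NOT Compose's internal format, just an illustration.
Instruction = Tuple[str, str]


def entities_with_changes(ct_row_counts: Dict[str, int]) -> Set[str]:
    """Return the entities whose __CT table currently holds at least one row."""
    return {entity for entity, rows in ct_row_counts.items() if rows > 0}


def select_instructions(
    instructions: List[Instruction], ct_row_counts: Dict[str, int]
) -> List[Instruction]:
    """Keep only the instructions that belong to entities with pending changes."""
    changed = entities_with_changes(ct_row_counts)
    return [ins for ins in instructions if ins[0] in changed]


# Assumed input: row counts per entity, e.g. gathered with one
# count(*) query per __CT table before the ETL set is executed.
ct_row_counts = {"Customer": 12, "Orders": 0, "Product": 0}

instructions = [
    ("Customer", "-- populate staging for Customer"),
    ("Orders", "-- populate staging for Orders"),
    ("Customer", "-- load Customer into the warehouse"),
    ("Product", "-- populate staging for Product"),
]

to_run = select_instructions(instructions, ct_row_counts)
print(len(to_run), "of", len(instructions), "instructions would run")
```

With only Customer changed, the Orders and Product instructions are skipped entirely, which is the behaviour I am hoping for at the scale of 106 Entities.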
Hi @Hrishikesh, phase 1 of reducing or eliminating the execution of some statements based on run-time parameters is under development now. I don't have an explicit date, as it depends on development, testing, etc. However, I will update you here when I know more.
OK, thanks for the update. Will this eliminate the "Populating staging table..." ETL statements and the staging table data processing steps that follow when there is no landing data available in the __CT tables?
Compose November 2022 SR (a patch release on top of 2022.05) has additional filtering of ETL instructions to reduce the churn when there is no data to process.