SumitSingh
Partner - Contributor III

Incremental load is not supported for aggregated data marts.

My requirement is to create additional aggregate tables based on existing table columns in the data warehouse. For this I have created an aggregated table in the data mart, since the data mart provides that feature. The problem with the aggregated mart is that it doesn't support incremental load; we always have to do a full load instead. My data is already loaded in the data warehouse and it is huge, running into crores (tens of millions) of rows, so doing a full load every time is not feasible.

Is there any workaround so that we can create an aggregation table and load data into it incrementally?

10 Replies
avinashelite

Hopefully you have a partition column in your data warehouse; you could base your incremental load on that partition column.

SumitSingh
Partner - Contributor III
Author

@avinashelite My problem here is that we don't have an incremental load type in the data mart. When I load data incrementally into my warehouse table, a CDC task handles it; running the CDC task for the table does the job for the warehouse table. In the data mart we create the aggregated table based on the warehouse table, but the challenge is that when we run the data mart table task it always does a full load, never an incremental one.

avinashelite

Okay, got you. Do you have a primary key or date column in your aggregated table?

SumitSingh
Partner - Contributor III
Author

Yes, I have a primary key and a date field in my warehouse table. In my case a sequence number is used as the primary key. In the data mart we are creating a monthly aggregate using that date field.

avinashelite

Good! Then you could use the primary key or the date for the incremental loading, right?

e.g. up to Jul 6th, the data is loaded up to key 1000.

The next day you fetch the max key from the QVD and frame a query WHERE key > 1000, so you fetch only the new records. Similarly, you could base it on the date column.
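A minimal sketch of that incremental query, assuming hypothetical names FactTable, SequenceNumber and LoadDate for the warehouse table, its key and its date column (the max key or date would be read from the existing QVD beforehand):

  -- Fetch only the records added since the previous load
  SELECT * FROM FactTable WHERE SequenceNumber > :last_max_key;

  -- The same idea based on the date column instead of the key
  SELECT * FROM FactTable WHERE LoadDate > :last_load_date;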

SumitSingh
Partner - Contributor III
Author

@avinashelite Yes, I already do this on the Qlik Sense end, as you described. But I'm looking for a solution on the Qlik Compose for Data Warehouses end.

TimGarrod
Employee

@SumitSingh - Currently, as you've said, aggregate and state-oriented data marts are full rebuilds only when using Compose's data mart automation. Aggregates aren't refreshed incrementally because, to do so, Compose would have to know both the current and the prior values of any attribute you are aggregating on, which is not possible for a Type 1 attribute. For example, if you are aggregating by ProductType and ProductType changes for a number of records, Compose would have to know both the prior and the current ProductType values to work out which records in the aggregated fact change.

 

Having said that, you can work around this as follows.

  • Create your Aggregated Data mart
  • Apply a filter using a date / year / month etc. range that encompasses the data set you wish to refresh (typically when rebuilding aggregates you only need to rebuild the last month / 6 months / year, etc.).
    • I like to use a control table to manage this filter, as it lets me change the values without requiring a code deployment.
    • e.g. CREATE TABLE Agg_Filters (FactTable varchar(100), MinFilterValue varchar(50), MaxFilterValue varchar(50))
    • Then the filter would be something like:
    • YEAR(${DateColumn}) >= (SELECT MinFilterValue FROM Agg_Filters WHERE FactTable = 'MyFactTable') AND YEAR(${DateColumn}) <= (SELECT MaxFilterValue FROM Agg_Filters WHERE FactTable = 'MyFactTable')
  • Create a copy of the fact table structure and give it a new name (MyFactTable_XXX).
  • Create a Post-Load step to delete from MyFactTable_XXX for the date range being processed and then INSERT INTO MyFactTable_XXX SELECT * FROM MyFactTable (see the sketch after this list).
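Putting those last pieces together, a rough sketch of that Post-Load step might look like the following. The names (Agg_Filters, DateColumn, MyFactTable, MyFactTable_XXX) are the illustrative ones from above, and YEAR() simply matches the filter example; use the equivalent function on your target warehouse.

  -- Clear the slice of the manually managed copy that is about to be refreshed
  DELETE FROM MyFactTable_XXX
  WHERE YEAR(DateColumn) >= (SELECT MinFilterValue FROM Agg_Filters WHERE FactTable = 'MyFactTable')
    AND YEAR(DateColumn) <= (SELECT MaxFilterValue FROM Agg_Filters WHERE FactTable = 'MyFactTable');

  -- Copy the freshly rebuilt slice in from the Compose-managed aggregate
  INSERT INTO MyFactTable_XXX
  SELECT * FROM MyFactTable;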

Using this method lets you leverage the automation in Compose to recreate whatever segment of the aggregation needs to be reloaded, and to refresh just that section of the manually managed table.

On the first run, have Compose load the entire aggregate (set MinFilterValue and MaxFilterValue to cover the entire range of values). On subsequent runs, set appropriate Min and Max filter values so you refresh just the segment of the aggregate you wish to refresh.
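For example, switching the control table from a full load to a rolling refresh of a single year could be as simple as the following (the values are purely illustrative):

  -- First run: cover the full range so Compose rebuilds everything
  UPDATE Agg_Filters SET MinFilterValue = '1900', MaxFilterValue = '2100' WHERE FactTable = 'MyFactTable';

  -- Subsequent runs: only rebuild the current year
  UPDATE Agg_Filters SET MinFilterValue = '2021', MaxFilterValue = '2021' WHERE FactTable = 'MyFactTable';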

 

Of course, this does require knowing which "ranges" of data you wish to refresh each day. If you don't, and you want the data set rebuilt on each run, you can follow the same process and leverage renaming / truncating / cloning (depending on your target DW environment) in the Post-Load step to reduce data downtime for consumers. One possible shape for such a swap is sketched below.
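A possible rename-based swap in Oracle syntax (table names are hypothetical, and other warehouses have their own rename / clone mechanisms):

  -- Build a fresh copy of the rebuilt aggregate, then swap it in with minimal downtime
  CREATE TABLE MyFactTable_NEW AS SELECT * FROM MyFactTable;
  ALTER TABLE MyFactTable_XXX RENAME TO MyFactTable_PREV;
  ALTER TABLE MyFactTable_NEW RENAME TO MyFactTable_XXX;
  DROP TABLE MyFactTable_PREV;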

 

Hope this helps!

SumitSingh
Partner - Contributor III
Author

@TimGarrod The MONTH and YEAR functions are not working in Compose with an Oracle database. For example, when I write:

YEAR(${DateColumn}) it gives the error ORA-00904: "YEAR": invalid identifier.

It is the same with the MONTH function.

How do I derive the month and year from a date column in Oracle?

TimGarrod
Employee

You need to use Oracle functions. YEAR / MONTH were just an example; those are SQL Server functions. As an ELT solution, Compose leverages the functions of the underlying data warehouse - in your case, Oracle.

So for Oracle you can use the EXTRACT function.
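For example (using the same ${DateColumn} placeholder as the filter earlier in the thread):

  -- Year and month from a DATE column in Oracle
  EXTRACT(YEAR FROM ${DateColumn})
  EXTRACT(MONTH FROM ${DateColumn})

  -- So the data mart filter from the workaround becomes, for example:
  EXTRACT(YEAR FROM ${DateColumn}) >= (SELECT MinFilterValue FROM Agg_Filters WHERE FactTable = 'MyFactTable')

TO_CHAR(${DateColumn}, 'YYYY') and TO_CHAR(${DateColumn}, 'MM') are an alternative if you prefer string values.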