Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!

An Introduction to Qlik Replicate Tasks: Full Load vs CDC

100% helpful (1/1)
cancel
Showing results for 
Search instead for 
Did you mean: 
Michael_Litz
Support
Support

An Introduction to Qlik Replicate Tasks: Full Load vs CDC

Last Update:

Jan 4, 2023 8:59:39 AM

Updated By:

Sonja_Bauernfeind

Created date:

Jan 4, 2023 8:56:20 AM

Qlik Replicate tasks can be configured to move data from a source DB to a Target DB. These tasks can be configured in a variety of ways to move data from the source to the target database.

This article aims to compare three popular task configurations and will provide additional details for the available Full Load settings. It is a supporting document to the Techspert discussion: Full Load.

Available configurations:

 

FULL LOAD only

FULL LOAD tasks are probably the simplest form of a Qlik Replicate task. With a Full Load only task, the data will be copied from the source tables to the target tables (Base Tables) and then the task will stop.

The target tables are basically a mirror image of the source tables with static data based on the last time the task was run.

The target tables are not kept in sync. This type of configuration is very similar to an old-style ETL type job, where the data is copied over one time when the task runs.

This Full Load type of task can be scheduled to run using replicates built-in scheduler or the Qlik replicate command line or rest API.

The task and target tables can utilize some features of Qlik replicate such as transformations and filters.

NOTE: A full load-only task can get around a limitation with LOB columns that are in a table with no primary key.
This limitation is for CDC tasks and the following is the CDC task notification you will get:

Column 'MyLob' was removed from table definition 'dbo.No_PK_LOB': the column data type is LOB and the table has no primary key or unique index

It is required that a table with LOBs has a primary key, due to the way the task will perform a lookup back to the source table to get the LOB value. As this lookup is only done during the CDC phase, it is not a limitation for a Full Load task, which will be able to read the entire LOB column when it is copying the source data to the target table.

 

Full Load settings screen:

 

Michael_Litz_0-1672187591161.png

 

CDC only

CDC only tasks will capture changes from the source and write them out to the target database. In this type of task, there are typically no (Base Tables) in the target.

The Apply changes setting is typically disabled and only the Store Changes setting is enabled. With this type of configuration, the task will read changes from the source and write out the records to the Qlik Replicate Store Changes table also known as __CT tables.

This type of task configuration is typically used when the data in the target __CT table is going to feed a downstream process. The structure of these __CT tables is usually based on the source table and will contain the fields from the source as well as any transformation fields. The big feature of the __CT table is that it also contains the various header information from the source such as commit time, Operation Indicator etc.

 

Store Changes settings screen:

 

Michael_Litz_2-1672187721411.png

 

FULL LOAD and CDC task

FULL LOAD and CDC tasks will typically contain the target (Base Tables) and they will populate those tables during the full load phase of a task.

Then the CDC phase of the task will capture every change to the source records and apply them to the target (Base Tables). This is a very typical configuration for a Qlik replicate task.

Note: this type of task can also have the store changes enabled (as described above in 2)

 

Conceptual vs. Actual task process order

Clarification of an often misunderstood task phase order related to the Full Load and CDC task:

When discussing Full Load and CDC combined in a single task, we think in terms of the Full Load phase starting first and then the CDC phase starting second to apply changes and keep the task in sync. While this is conceptually correct, the phases actually are reversed and in fact, the CDC phase is the one that starts first; capturing and caching changes.

Once the CDC phase starts, then the full load phase will start.

This order is needed in order to insure that any changes made to the table during the Full Load will be captured.

For example, if you happen to have a very large table that takes three hours to fully load and the CDC phase did not start until it was complete, you would miss those three hours of changes being applied to the table.

If you have ever seen a message in the log at the top of the log about consistency timeout this is why:

W: Transaction consistency timeout occurred. x transactions are still open

 

CDC - Apply changes setting screen:

 

Michael_Litz_1-1672187659632.png

 

 

Related Content

Transformation: Source - Filter for Delete
Filter for last 90 days of data in Qlik Replicate
Qlik Replicate Transaction Consistency Timeout occurred. xtransactions are still open
Qlik Replicate Full Load and CDC Split Task: Considerations

 

The information in this article is provided as-is and to be used at own discretion. Depending on tool(s) used, customization(s), and/or other factors ongoing support on the solution below may not be provided by Qlik Support.

 

Environment

Qlik Replicate 

Labels (1)
Comments
eksmirnova
Contributor III
Contributor III

@Michael_Litz  Thank you for the article. This is really helpful. But one thing is not clear to me about FULL LOAD only tasks.

You are saying "The target tables are basically a mirror image of the source tables with static data based on the last time the task was run."

What does exactly that mean "on the last time the task was run"?

Let's say I have a "FULL LOAD only" task with just one table. But the full load takes 4 hours to complete. So my task started at 12:00 pm and stopped at 4:00 pm. And 20 transactions were made between 12:00 - 4:00pm on the source.

Would I have the snapshot of this table on 12:00 pm or 4:00 pm ?

Thank for your answer in advance!

Sonja_Bauernfeind
Digital Support
Digital Support

Hello @eksmirnova 

I have feedback from a subject matter expert for you:

Full Load is just that: no changes will be applied. Having said that, if the source is an MS SQL server, then you do have access to an internal parameter you can set that will read from a snapshot at the time of the Full Load. The parameter is readSnapshotOnUnload. With it, you have a definite point in time and static records that will load.

If you have any other questions regarding this, please post them directly in the Qlik Replicate forums, where our active support agents and knowledgeable Qlik peers are able to help you.

All the best,
Sonja 

Version history
Last update:
‎2023-01-04 08:59 AM
Updated by: