Qlik Replicate: Implementing UPSERT and MERGE modes by applying a Conflicts Handling Policy

Michael_Litz · Feb 3, 2021 7:51:49 PM

Note: The concept ’ UPSERT MODE’ and 'MERGE MODE' is not documented in the User Guide. i.e. it is not a word you can search for in the User Guide and is not a key word in the Replicate UI.

What they do:

UPSERT MODE: Change an update to an insert if the row doesn't exists on the target
MERGE MODE: Change an insert to an update if the row already exists on the target

How to set

Use MERGE MODE: i.e. configure the task under: task setting --> Error Handling --> Apply Conflicts --> ‘Duplicate key when applying INSERT:’ UPDATE the existing target record

Use UPSERT MODE: i.e. configure the task under: task setting --> Error Handling --> Apply Conflicts --> ‘No record found for applying an UPDATE:’ INSERT the missing target record

How they work:

Batch Apply and Transactional Apply modes:

There is a big difference in how these Upsert/Merge settings work depending of whether the task is in 'Batch' or 'Transactional' Apply mode.

Batch Apply mode:

Either option (Upsert/Merge) does an unconditional Delete of all rows in the batch, followed by an Insert of all rows.

Note: The other thing to note is that with this setting the actual update that fails is inserted in a way that may not be obvious and could cause issue with downstream processing. In batch apply mode the task will actually issue a pair of transactions (1st a delete of the record and then 2nd an insert) this pair of transactions is unconditional and will result in a "newly inserted row every time the record is updated on the source.

Transactional Apply mode:

Either option (Upsert/Merge) - the original statement is run and if it errors out then the switch is done (try and catch).

Insert in transactional apply mode, the insert statement will be performed in a "try / catch" fashion. The insert statement will be run and only if it fails will it be switched to an update statement .

In transactional apply mode, the update will be performed in a "try / catch" fashion. The update will be run and only if it fails will it be switched to an insert statement .

Gotchas:

If the table has added fields that are used/modified by a downstream process then there is a good chance that this will cause a problem.
Oracle table must have supplemental logging all column set
Masking of possible underlying issue - why was the record not there for an update or why was it there already for and insert
I want to caution you that there may be other missing records (from a failed insert) that were not captured and this setting will only bring over missing records if they are updated on the source. There could be many records still missing on the target.

Possible use case:

Missing values in some fields/missing rows on target. With Upsert/Merge set a "fake" update can be issued on the source (Update field1 = field1) without a where clause will force every row in the source to refreshed to the target ( without losing all the target data like a reload would)
Prevent a task from switching out of bulk apply due to errors 'Duplicate PK or Record not found (appliance targets could benefit from this)
Could help the CDC only task of split table environment - Separate Full Load / CDC design.

The information in this article is provided as-is and to be used at own discretion. Depending on tool(s) used, customization(s), and/or other factors ongoing support on the solution below may not be provided by Qlik Support.

Environment

Qlik Replicate

jamshed · ‎2024-10-06

Will it work without CDC ? i am trying but not working without CDC, kindly suggest me if you know any there ways to resolved this

john_wang · ‎2024-10-06

Hi @jamshed ,

This process is designed for Change Data Capture (CDC) only. It does not apply during the Full Load phase.

For the initial load, there are four options available:

DROP and CREATE Table: This will drop the existing table and recreate it before loading data.
TRUNCATE Before Loading: The existing table's data is cleared, but the table structure remains intact.
ARCHIVE and CREATE Table: The table is archived, and a new one is created before loading data. (Currently this option is only available for the Hadoop target endpoint.)
Do Nothing: Existing data and metadata of the target table will not be affected. New data will be added to the table.

If the initial load is interrupted and the task stops before completion, the next run will restart the entire initial load from the beginning.

Regards,

John.

Qlik Replicate: Implementing UPSERT and MERGE modes by applying a Conflicts Handling Policy