Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
We have a log stream and child tasks are continuously replicating delta changes.
Is it possible to stop replication and select rows that were discovered out of sync, filter them reload them, and then resume replication? We were notified by the business that one customer's data looked unmatched in the source database. We would like to choose that customer (7 records) and reload to update the missing changes without interrupting the daily changes and throwing constraint errors if data already exists in the target database.
The target table has historical data so the table cannot be truncated to reload the missing changes.
Could you please confirm whether that is possible and provide instructions on how to proceed?
Thanks,
AU
Hello @Attunity_user
Yes, you can follow this approach:
Stop the Logstream Child Task
Once the task is stopped, note down the final saved stream position. It should look something like this:
[SORTER ]I: Final saved task state. Stream position 00054a9d:000a5ae8:0003, Source id 94638621, next Target id 55588857, confirmed Target id 55588697, last source timestamp 1721610573541116 (sorter.c:781)
Note: Pay attention to the last source timestamp
(e.g., 1721610573541116
).
Convert the Timestamp
Use an epoch converter to convert the source timestamp (1721610573541116
) into a human-readable format. below is an example
Task Setting
Set the task to:
Turn Off Change Processing
Disable the change processing and then add your filter for the seven specific records.
Initiate the Load
Start the load process. Once the full load is completed, stop the task.
Re-enable Change Processing
Enable change processing for the task again and restart it from the timestamp you obtained after converting it using the epoch converter.
In my case its "Your time zone: Monday, July 22, 2024 6:39:33.541 AM
"Its always recommended to give 15 mins of buffer time , example start it from 6:20 AM in above case
This process should ensure that your CDC resumes from where it left off. Just make sure the parent task remains running throughout this process and does not stop.
Sometimes it is enough to just perform a benign update "update set X=X where PK in (...)" on the (7?) source rows with issues to push them back to the targets. For that they need to pre-exist or the task error handling set to 'ignore missing rows on update'.
Or just update the targets outside Replicate and hope it is not a structural problem.
Hein.
Hi Hein,
Can you please provide the steps with an example? If there are records in the target with a few non-PK column mismatches, how do we add the benign update statement for those rows with the PK column?
When I tried to reload one customer, it did not work and failed with a constraint error.
Here are the steps I took.
Situation: Customer AB (7 records) is mismatched in a few columns in the target table.
Without truncating or deleting rows in the target database, I need to perform a reload for that customer to update a few column mismatch and not impact the CDC changes.
1. Stopped the task and deactivated CDC. Change the full load settings to 'do nothing'.
2. In the task setting, choose the Customer filter (string) and include the customer number value.
3. Disable the constraint in the Target database table.
4. Click on Reload.
Result: The task threw a constraint error and stopped, although the apply conflict was already in place for Insert/Update (Duplicate key when applying INSERT/No record found for applying an UPDATE).
Is there a solution to reload using the Qlik replicate task to only reload the older records and automatically update rows in the target using the PK column? Maybe the Qlik replicate task cannot be used for scenarios like this to reload selective rows to apply updates but I am looking for confirmation if that is true.
Thanks,
AU
Hi Aarun,
I followed the given steps. It did not work and failed with a constraint error.
Here are the steps I took.
Situation: Customer AB (7 records) is mismatched in a few columns in the target table.
Without truncating or deleting rows in the target database, I need to perform a reload for that customer and not impact the CDC changes.
1. Stopped the task and deactivated CDC.
2. In the task setting, choose the Customer filter (string) and include the customer number.
3. Disable the constraint in the Target database table.
4. Click on Reload.
Result: The task still threw a constraint error, although the apply conflict was already in place for Insert/Update (Duplicate key when applying INSERT/No record found for applying an UPDATE).
Is there a solution to reload using the Qlik replicate task to only reload the older records and automatically update rows in the target using the PK column?
Is it possible to perform reload for a few rows to update records without throwing constraint error?
It only worked when those mismatched records directly deleted from the target database table. In that case, the records were reinserted. That is risky and would like to avoid it in production.
Thanks,
AU
Hi Aarun,
I followed the provided steps, but unfortunately, it failed with a constraint error. Here's what I did:
Scenario: Customer AB (with 7 records) has mismatched data in some columns in the target table.
Objective: I need to reload the data for Customer AB into the target table without deleting or truncating rows in the target database and without impacting CDC changes.
Stopped the task and disabled CDC.
Applied a customer filter in the task settings, specifying the customer number (string) to include only Customer AB.
Disabled constraints on the target database table.
Initiated the reload process.
Even with these settings, the task threw a constraint error. This happened despite having conflict handling in place for "Insert/Update" (i.e., handling duplicate keys during INSERTs and missing records during UPDATEs).
I want to reload only the older records for Customer AB while ensuring that the PK column is used to automatically update the target records.
The only way I was able to resolve this issue was by deleting the mismatched records from the target database table. After doing this, the records were successfully reinserted during the reload. However, this approach is risky and not feasible for a production environment.
Is there a way to perform a targeted reload for specific rows, ensuring they are updated in the target table without causing constraint errors? If so, how can this be achieved using a Qlik Replicate task?
Hello @domenic85martha ,
You may new a Temporary Task (even without LogSteam involved) to operate these 7 rows individually (the existing task can be kept running in the meanwhile).
Hope it helps.
John.
Hi @domenic85martha ,
The steps outlined above is a Full Load process. Full Load does not handle conflict records. Conflict handling for "Insert/Update" applies only to CDC. Therefore you need to delete target record manually.
Regards,
Desmond
Based on all responses received, is it fair to say that when there is a gap or mismatched records in the target table for whatever reason, while the CDC is running, without deleting records from the target table in the production environment, the current log stream task or any temporary tasks cannot be utilized to backfill them? In other words, only a full (with truncate) load option is available if deleting records directly from the table is not permitted. Because the task will throw a constraint error if mismatched records exist in the target table. Full load configuration will not allow inserting/updating records. This is a limitation for Qlik replicate.
Thanks,
AU