Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik and ServiceNow Partner to Bring Trusted Enterprise Context into AI-Powered Workflows. Learn More!
cancel
Showing results for 
Search instead for 
Did you mean: 
Al_gar
Creator
Creator

Handling duplicate records on a full load

For Qlik Replicate December 2024 (2024.12.0.21) November 2024 (2024.11.0.177) using HANA as a source and SQL Server 2019 as target, we're having issues handling duplicate records on a full load. We've tried the steps posted in the technical article "Handling Duplicate Rows when doing a Full Load in Replicate" to add --FILTER: 1=1) ORDER BY Col1, Col2 -- in the Table Settings -> Filter -> Record Selection Condition, but our task keep failing when there are duplicate records.

Labels (2)
1 Solution

Accepted Solutions
Al_gar
Creator
Creator
Author

@john_wang,

Since our issue was impacting only the full load for a particular table we ended up with an alternative workaround. We configured the task as full load with CDC and we removed the PK under table settings > transformation. We started the task and once the table finished the full load we then removed the duplicate keys and added the PK constraint in the target database. The Replicate task kept loading changes as intended and we didn't had to do any other major change.

View solution in original post

8 Replies
john_wang
Support
Support

Hello @Al_gar ,

It seems I'm not familiar with version 2024.12.0.21. Could you please share which software provider it’s from, or let me know the platform on which you're running this specific version?

Thanks,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
Al_gar
Creator
Creator
Author

My apologies @john_wang, the correct version is  Qlik Replicate November 2024 (2024.11.0.177) and it runs on a Windows Server 2019.

john_wang
Support
Support

Hello @Al_gar ,

Thanks for the clarification.

I'd like to clarify the basic scenario:

  1. Duplicate Rows During Full Load:
    If duplicate row(s) are inserted or updated during the Full Load phase, note the following:

    • If these rows are inserted after the Full Load has started, they may be replicated to the target as part of the Full Load (i.e., their first replication).

    • After the Full Load completes, the cached changes from Change Processing will be applied to the target (i.e., their second replication).
      This could result in duplicate rows being inserted into the target. Especially if the source endpoint database does not support snapshot.

  2. Test Without Change Processing:
    Try disabling the Change Processing option to determine whether the issue occurs during Full Load ONLY task. This can help isolate the problem.

  3. Stop Task Before Applying Cached Changes:
    Set the task option to "After Full Load completes, stop the task""Before cached changes have been applied", and observe the results. This will help verify whether the duplicates are introduced by the cached changes.

Regards,

John.

 

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
Al_gar
Creator
Creator
Author

I tried using full load only and this is the error message:

The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'SAPABAP1.ZTABLE' and the index name 'ZTABLE_ZTABLE_1750868909410735_PK'. The duplicate key value is (000, FD1123456 (2000x), X1234, 00). Line: 1 Column: -1; Execute create Primary Key/Unique Index failed, statement CREATE UNIQUE INDEX [ZTABLE_ZTABLE_1750868909410735_PK] ON [SAPABAP1].[ZTABLE] ( [COL1], [COL2], [COL3], [COL4] )

 

 

john_wang
Support
Support

Hello @Al_gar ,

You've made good progress so far.
Please take a closer look at the row with the unique index values: 000, FD1123456 (2000x), X1234, 00.

It appears that more than one row exists in the target table for this key.
Kindly compare it with the source data to identify any discrepancies and understand the root cause.

Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!
Al_gar
Creator
Creator
Author

Hi @john_wang,

Source system collation is case sensitive while Target is not. There should be a unique value, but since we're dealing with a DEV system we found the some "duplicate" records for this particular table:

000, FD1123456 (2000x), X1234, 00
000, fd1123456 (2000x), X1234, 00

Can you confirm if this is possibly why Handling Duplicate Rows when doing a Full Load in ... - Qlik Community - 1712074 won't work in this scenario?

Thanks.

Al_gar
Creator
Creator
Author

@john_wang,

Since our issue was impacting only the full load for a particular table we ended up with an alternative workaround. We configured the task as full load with CDC and we removed the PK under table settings > transformation. We started the task and once the table finished the full load we then removed the duplicate keys and added the PK constraint in the target database. The Replicate task kept loading changes as intended and we didn't had to do any other major change.

john_wang
Support
Support

Hello @Al_gar ,

Thanks for the update.

The article you mentioned is used while there is aching changes which were made during full load. however it's a different scenario in your project.

Regards,

John.

Help users find answers! Do not forget to mark a solution that worked for you! If already marked, give it a thumbs up!