JacobTews
Contributor III

How do I handle a duplicate concatenated primary key coming from the source?

We are replicating tables from a DB2 source ERP system to a SQL Server database. One table comes in fine during a full load, but does not play nicely with CDC. The issue seems to be that the source is case-sensitive, so it sees no issue with item tags of 'P50137' and 'p50137' coexisting, but when Replicate tries to handle the table, I get the following error:

Execute create Primary Key/Unique Index failed, statement ALTER TABLE [MW4FILE].[ITEMTAG] ADD CONSTRAINT [ITEMTAG_Q_MW4FILE_ITEMTAG_ITDIST_00001] PRIMARY KEY ( [ITEMDISTRICT], [ITEMNUMBER], [TAGNUMBER], [BUILDUPSEQ] )

RetCode: SQL_ERROR SqlState: 23000 NativeError: 1750 Message: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Could not create constraint or index. See previous errors. Line: 1 Column: -1

RetCode: SQL_ERROR SqlState: 23000 NativeError: 1505 Message: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'MW4FILE.ITEMTAG' and the index name 'ITEMTAG_Q_MW4FILE_ITEMTAG_ITDIST_00001'. The duplicate key value is (14, 43092, p50137, 0). Line: 1 Column: -1

Failed (retcode -1) to execute statement: 'ALTER TABLE [MW4FILE].[ITEMTAG] ADD CONSTRAINT [ITEMTAG_Q_MW4FILE_ITEMTAG_ITDIST_00001] PRIMARY KEY ( [ITEMDISTRICT], [ITEMNUMBER], [TAGNUMBER], [BUILDUPSEQ] )'

The data can't/won't be fixed in the source, so I am hoping Replicate has a setting that can handle it without suspending the table, which is what happens currently after the full load. If need be I could just set a separate task for this table to rerun a full load on a schedule, but I'd like to believe that there's a more elegant solution.

Any ideas?

Labels: Qlik Replicate
5 Replies
john_wang
Support

Hello @JacobTews ,

Thanks for reaching out!

If the source DB2 database is case-sensitive, then the target should be as well. Fortunately we can do that easily in SQL Server: for example, create the target database with a case-sensitive collation (the CS in the collation name specifies case-sensitive).
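A minimal sketch of what that might look like (the database name is hypothetical, and Latin1_General_100_CS_AS is just one case-sensitive choice; pick the collation that matches your language settings):

    -- Create the target database with a case-sensitive (CS) collation.
    -- 'ReplicateTarget' and the collation name are examples only.
    CREATE DATABASE ReplicateTarget
        COLLATE Latin1_General_100_CS_AS;

    -- Verify the database collation afterwards:
    SELECT DATABASEPROPERTYEX('ReplicateTarget', 'Collation') AS DatabaseCollation;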

Hope it helps.

Regards,

John. 

JacobTews
Contributor III
Author

Thanks, @john_wang!

If I'm unable to update the table collation in the target, is there a way to use the Filter tool in the Table Settings menu to choose just one of the two records to replicate, ignoring the other? I've tried what's outlined in the documentation to no avail: both records still come in on full load of the table, then Replicate suspends the table because of the error.

Also, FWIW, I set up a new task to just run a daily full load of the table in question, and that fails, too.

Dana_Baldwin
Support

Hi @JacobTews 

You might be able to set a filter using the SQL Server "lower" function; a sketch of one possible condition follows the links below. Here are a couple of links with examples of how to use filters:

Filter for last 90 days of data in Qlik Replicate - Qlik Community - 1880761

Filter for last 90 days of data in Qlik Replicate ... - Qlik Community - 1980009
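For this particular case, a hedged sketch of what such a record selection condition might look like (Replicate filter expressions reference source columns with a $ prefix and are evaluated with SQLite syntax; TAGNUMBER is assumed to be the offending column):

    -- Keep only rows whose tag is already upper-case, so 'P50137' passes
    -- and 'p50137' is filtered out. Adjust the column name to your table.
    $TAGNUMBER = upper($TAGNUMBER)

Note this makes an arbitrary choice of which duplicate survives, a point discussed further below.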

Hope this helps

Dana

Heinvandenheuvel
Specialist II

What do you mean by "duplicate concatenated" key in the subject? I do not see a reference to that in the description. What is it, and why do you think it is relevant?

You do realize that you are asking SQL Server + Replicate to make an arbitrary decision on which row to keep, right?! It will be the one that sorts first under case-sensitive ordering - the other row, uppercase or lowercase, will never be seen. Is that acceptable to the business users on the target? You may be able to get the target business users to either realize that they really need a case-sensitive target to avoid that selection decision, or have them turn around to the source folks and instruct them to fix the source. You as an implementer should not try to decide. Your task is to point out the issue and let others make a political/policy decision, because that's what this is. This is not a technical question.

Now you could try a technical hack/solution by studying ALL the duplicates in question. Google is your friend for finding the SQL to locate the duplicates. Is there another field which would make them unique? Perhaps a creation date? Or a username? Add those to the target PK, but please realize you are then just pushing the issue to the target, as the end users will now have to know how to deal with duplicates when querying by just the main PK segment.
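As a starting point, a hedged sketch of a duplicate-finding query (column names are taken from the error message; run it against the source and adjust as needed):

    -- Find key combinations that collide once case is folded.
    SELECT ITEMDISTRICT, ITEMNUMBER, UPPER(TAGNUMBER) AS TAGNUMBER_FOLDED,
           BUILDUPSEQ, COUNT(*) AS DUP_COUNT
    FROM MW4FILE.ITEMTAG
    GROUP BY ITEMDISTRICT, ITEMNUMBER, UPPER(TAGNUMBER), BUILDUPSEQ
    HAVING COUNT(*) > 1;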

It seems to me the task is using the full-load setting "Create primary key or unique index after full load completes".

This is not the default; you picked that. (Why? Did you measure the difference, or was it just gut feel? You just lost whatever advantage it might give by struggling with it.) Without that choice the duplicates will not 'break' the load but will generate a warning, best I recall (I may recall this wrong). This is all easy enough to test with a small table on the source with the bad conditions in place. YOU should test this so that you fully know what to expect.

Best is probably to change the target DB collating sequence to fix the (IMHO bad) case-insensitive default MS SQL uses to 'help' non-power users. You can also change the collation for just the specific table, but then you must make sure NOT to use DROP + CREATE on reload, but TRUNCATE, to keep the setting.
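If you go that narrower route, a hedged sketch (the constraint and column names come from the error message; the varchar length is an assumption, and the PK must be dropped before the column's collation can change):

    -- Drop the PK so the key column can be altered.
    ALTER TABLE MW4FILE.ITEMTAG
        DROP CONSTRAINT ITEMTAG_Q_MW4FILE_ITEMTAG_ITDIST_00001;

    -- Switch only the offending column to a case-sensitive collation.
    ALTER TABLE MW4FILE.ITEMTAG
        ALTER COLUMN TAGNUMBER varchar(20) COLLATE Latin1_General_100_CS_AS NOT NULL;

    -- Recreate the PK; 'P50137' and 'p50137' now compare as distinct.
    ALTER TABLE MW4FILE.ITEMTAG
        ADD CONSTRAINT ITEMTAG_Q_MW4FILE_ITEMTAG_ITDIST_00001
        PRIMARY KEY (ITEMDISTRICT, ITEMNUMBER, TAGNUMBER, BUILDUPSEQ);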

BTW, you can also change the SQL syntax Replicate uses in order to alter the collating sequence, but I'd suggest Professional Services support for that.

Thanks for posting the complete error as text, not as a picture. You might want to post the task JSON (critical elements like the task settings only) as an attachment as well. We don't need to see all the included tables, just enough to get the idea.

john_wang
Support

Hello @JacobTews ,

It looks to me like @Heinvandenheuvel has given the best explanation of the issue. BTW, even if we make the 'duplicate' rows pass through from source to target (e.g. by setting the task to UPSERT mode), from the application's data point of view it leads to data loss, because the 2 different rows in the source become 1 single row in the target.


@JacobTews wrote:

If I'm unable to update the table collation in the target...


I'm not sure how you did it, but please note that in SQL Server collation works a bit differently from other RDBMSs. In SQL Server there are only 2 collation levels: database-level and column-level (there is NO table-level collation). When you create or alter a table, you can specify a collation for each character-string column by using the COLLATE clause. If you don't specify a collation, the column is assigned the default collation of the database.
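To illustrate the column-level clause (a sketch only; the column types are assumptions, and the collation name is an example):

    -- Collation is declared per character column, never per table.
    CREATE TABLE MW4FILE.ITEMTAG (
        ITEMDISTRICT int         NOT NULL,
        ITEMNUMBER   int         NOT NULL,
        TAGNUMBER    varchar(20) COLLATE Latin1_General_100_CS_AS NOT NULL, -- case-sensitive
        BUILDUPSEQ   int         NOT NULL  -- non-character columns take no collation
    );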

Regards,

John.
