Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
With an Azure SQL (MS-CDC) source database, we're seeing the task fail with a recoverable error once a day or once every two days.
These are the errors we're seeing:
Failed (retcode -1) to execute statement: '
select top 1 start_lsn as[CommitLsn], CONVERT(varchar, tran_begin_time, 21) as[BeginTime], CONVERT(varchar, tran_end_time, 21) as[EndTime],
tran_id as[TranId], tran_begin_lsn as[BeginLsn]
from[cdc].[lsn_time_mapping]
where start_lsn = (select min(start_lsn) from[cdc].[lsn_time_mapping] where tran_end_time >= CAST ( ? AS DATETIME2(0) ) and tran_id < > 0x00); ' [1022502] RetCode: SQL_ERROR SqlState: 08S01 NativeError: 10060 Message: [Microsoft][ODBC Driver 18 for SQL Server]TCP Provider: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. [1022502] Cannot get the first LSN after timeEndpoint is disconnected - select "__$operation", tran_id, tran_end_time, [Id], [SinkCreatedOn], [SinkModifiedOn], [statecode], [statuscode], [business_eventstatuscode], [business_isagreedresponse], [business_isbandtowreceipted], [business_isbatterysale], [business_isgeneratedockets], [business_isinterimevent], [business_ismanualevent], [business_isrechargeevent], [createdby], [createdby_entitytype], [createdonbehalfby], [createdonbehalfby_entitytype], [modifiedby], [modifiedby_entitytype], [modifiedonbehalfby], [modifiedonbehalfby_entitytype], [owningbusinessunit], [owningbusinessunit_entitytype], [owningteam], [owningteam_entitytype], [owninguser], [owninguser_entitytype], [business_cpsmemberclubid], [business_cpsmemberclubid_entitytype], [business_cpsproductid], [business_cpsproductid_entitytype], [business_cpsprogramid], [business_cpsprogramid_entitytype], [business_vendoragreementid], [business_vendoragreementid_entitytype], [ownerid], [ownerid_entitytype], [business_customerid], [business_customerid_entitytype], [createdbyname], [createdbyyominame], [createdon], [createdonbehalfbyname], [createdonbehalfbyyominame], [importsequencenumber], [modifiedbyname], [modifiedbyyominame], [modifiedon], [modifiedonbehalfbyname], [modifiedonbehalfbyyominame], [overriddencreatedon], [owneridname], [owneridtype], [owneridyominame], [owningbusinessunitname], [business_cpseventid], [business_cpsmemberclubidname], [business_cpsproductidname], [business_cpsprogramidname], [business_customerididtype], [business_customeridname], [business_customeridyominame], [business_eventbeat], [business_eventcommentsjson], [business_eventdetailsjson], [business_eventnumber], [business_eventsubtype], [business_eventtype], [business_occurredon], [business_vehicleregistrationnumber], [business_vendoragreementidname], [timezoneruleversionnumber], [utcconversiontimezonecode], [versionnumber], [IsDelete]
from [cdc].[dbo_business_cpsevent_CT] WITH(NOLOCK)
join [cdc].[lsn_time_mapping] WITH(NOLOCK) on start_lsn = __$start_lsn
where [**$start_lsn] between 0x000031C0002685F00001 and 0x000031C000321FC00034 ORDER BY [**$start_lsn] ASC, [**$command_id] ASC, [**$seqval] ASC, [__$operation] ASC
Can anyone give any clue as to why this is happening? It would be greatly appreciated. Thanks.
Kind regards,
Mohammed
Thanks @MoeE
You're on an early version of 2024.5 SP3 (2024.5.0.563) which is the latest available. I don't see any relevant fixes in the official SP3 so I can't say whether it would have any effect or not. There are a couple MS-CDC fixes in 2024.11 but they don't align with this issue.
This query most likely runs often to identify new changes. Since I don't see any documented fixes for this query to make it run faster, and it seems intermittent, I would try to rule out an environment issue first. If more help is needed after that you can update this post or open a support case (which might be better as we can elevate issues to our internal support via a case, where we cannot via Community).
Thanks,
Dana
Hi @MoeE
The first error points to some type of network connectivity issue.
The second issue might be due to the same reason - but I would check with your DBA and Azure to see if the explain plan for that query can be improved.
What is the full version number of Replicate you are using? With that, we can check for fixes/enhancements to that query in later versions.
Thanks,
Dana
Hi Dana,
Thanks for the quick response. The QR version is V2024.5.0.485.
We can try running the explain plan/execution plan. Do you think the query itself could be running for hours? Do you know why this would happen?
Regards,
Mohammed
Thanks @MoeE
You're on an early version of 2024.5 SP3 (2024.5.0.563) which is the latest available. I don't see any relevant fixes in the official SP3 so I can't say whether it would have any effect or not. There are a couple MS-CDC fixes in 2024.11 but they don't align with this issue.
This query most likely runs often to identify new changes. Since I don't see any documented fixes for this query to make it run faster, and it seems intermittent, I would try to rule out an environment issue first. If more help is needed after that you can update this post or open a support case (which might be better as we can elevate issues to our internal support via a case, where we cannot via Community).
Thanks,
Dana
Hi Dana,
No worries, thanks for the assistance. Much appreciated.
Regards,
Mohammed
Hello Mohammed, @MoeE
In addition to @Dana_Baldwin 's comments, please try running the query manually in the SQL Server database using the same account. This will help confirm whether the behavior is reproducible outside of Qlik Replicate.
From my perspective, the issue doesn’t appear to be related to Replicate itself. Let’s first isolate the problem, and then, if needed, we can look into adjusting the timeout settings or other configurations. Anyway, If a query runs for several hours, it usually indicates that the SQL Server is under heavy load or that there may be an issue with the database or the server itself.
Good luck,
John.
Hi John,
Yep, good idea. I've already asked the client to try this and waiting to hear back. Thanks for the helpful input.
Regards,
Mohammed
Hi Dana,
Just to update this for future reference, you were right. It looks like our client had lots of operations occurring on their source database as part of a script that was running (doing updates, inserts, deletes, etc).
After the script was completed they did not experience the issue again. So most likely it was to do with their source database overhead/resources.
Regards,
Mohammed