Hi Community,
I’m looking for some expert guidance on a scenario involving Qlik Replicate (2024.11) reading from SQL Server MS‑CDC.
Environment Overview
Source: SQL Server 2019 (MS‑CDC enabled)
CDC retention: 4 hours
Replicate task type: Log Stream–style task (single parent only, no child tasks)
Tables involved: 2 CDC tables
These tables only receive changes daily between ~20:30–00:00 BST
For the remaining ~20 hours each day, the tables are completely idle
The Issue
This task has been running fine for weeks, but on Jan 25, we suddenly hit a fatal error:
The required LSN '<LSN>' not found. Tables must be reloaded
Failed to get transaction id by lsn
What triggered it was a recoverable error earlier that day caused by an ODBC timeout on cdc.lsn_time_mapping:
SqlState: HYT00 NativeError: 0 Message: Query timeout expired
Endpoint is disconnected
Cannot get the first LSN after time
Replicate auto‑stopped, saved its bookmark LSN, then auto‑restarted a few seconds later.
On restart, it attempted to resume from this saved LSN:
Start source from stream position
But SQL Server CDC had already purged that LSN, so the restart failed with the “Required LSN not found” error and the task fully terminated.
My Confusion
If 4‑hour retention is too short for my workload, then why did this setup work normally on all the previous days?
These tables only write changes at night, so the saved bookmark LSN is often ~12 hours old by the next day.
Yet the task kept running fine — until this one incident.
Key Observations From Logs
The task was running continuously for days with no issues.
On Jan 25 @ 12:52 PM, Replicate attempted a query against cdc.lsn_time_mapping which timed out.
Replicate stopped and marked the event as recoverable:
Endpoint is disconnected
The saved bookmark LSN was from the previous night’s window.
When Replicate auto‑restarted at 12:52 PM, that LSN was already purged from CDC because retention is only 4 hours, causing:
Required LSN not found
On all previous days, we never had a mid‑day restart, so the task never needed to “seek” back to an old bookmark LSN.
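The timeline above can be sketched as a small Python model. It is a simplification under stated assumptions: it treats an LSN as purged the moment it is older than the retention period (the real cleanup job runs on its own schedule), and it assumes the last nightly change landed at about 00:00, which the logs above do not state exactly. The function name is hypothetical.

```python
from datetime import timedelta


def bookmark_survives_restart(bookmark_age, retention):
    """Return True if a saved bookmark is still inside the CDC retention
    window at the moment the task restarts.

    Simplified model: anything older than `retention` counts as purged.
    """
    return bookmark_age <= retention


# Assumption: last change at ~00:00, restart at 12:52, so the bookmark
# is roughly 12 h 52 min old when Replicate tries to resume from it.
bookmark_age = timedelta(hours=12, minutes=52)

print(bookmark_survives_restart(bookmark_age, timedelta(hours=4)))   # False
print(bookmark_survives_restart(bookmark_age, timedelta(hours=24)))  # True
```

In this model a task that never restarts never has to look up the old bookmark at all, which would be consistent with the setup working on previous days.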
My Questions for the Community
Is this expected behavior?
That as long as the task never restarts, it can keep running indefinitely even with retention smaller than the idle window?
Is the main root cause simply the mid‑day retry triggered by the ODBC timeout?
Meaning the CDC retention wasn’t the issue until the exact moment a restart happened during daytime?
Is increasing CDC retention to ≥24 hours (or 48–72 hours) the correct long‑term fix for workloads where tables are silent for long periods?
Is there a recommended ODBC timeout configuration for MS‑CDC sources?
Since the HYT00 timeout on cdc.lsn_time_mapping is what triggered the problematic restart.
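For the retention question, here is a minimal Python sketch of how the change could be expressed. It only builds the T-SQL text and does not connect to anything. sys.sp_cdc_change_job is the SQL Server procedure for changing CDC job settings and its @retention parameter is in minutes; the helper name cleanup_retention_sql is hypothetical, and whether 24, 48 or 72 hours is the right value for this workload is still an open question.

```python
def cleanup_retention_sql(hours):
    """Build the T-SQL that sets the CDC cleanup job's retention.

    sys.sp_cdc_change_job expects @retention in minutes, so the
    hour value is converted before it is embedded in the statement.
    """
    if hours <= 0:
        raise ValueError("retention must be a positive number of hours")
    minutes = int(hours * 60)
    return (
        "EXEC sys.sp_cdc_change_job "
        f"@job_type = N'cleanup', @retention = {minutes};"
    )


# Candidate values mentioned in the question above.
for hours in (24, 48, 72):
    print(cleanup_retention_sql(hours))
```

As far as I understand, a retention change only takes effect once the cleanup job has been stopped and started again (sys.sp_cdc_stop_job / sys.sp_cdc_start_job), so that is worth confirming on the instance.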
Additional Observation — Qlik Enterprise Manager (QEM) Alerting Gap
Despite having “Non‑Recoverable Error” alarm notifications enabled in Qlik Enterprise Manager, no alarm or alert was generated for this incident.
The task initially failed with a recoverable source‑side error (HYT00 timeout → endpoint disconnected).
QEM classifies recoverable faults as “warnings,” not “errors,” so it did not trigger an alarm.
When the task auto‑restarted, the error escalated to a non‑recoverable state (Required LSN not found), but by that time the task had already failed, and QEM did not raise an alert for the transition.
This behavior resulted in the failure going unnoticed until we manually checked the task logs.
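As a stopgap for the alerting gap, the task log could be scanned independently of QEM. The sketch below is a rough Python illustration, not a Qlik feature: the marker strings are taken from the messages quoted in this post, the recoverable/fatal grouping mirrors what was observed here, and the function names are hypothetical. It takes any iterable of log lines, so the log location is left to the caller.

```python
# Marker strings copied from the messages quoted in this post.
FATAL_MARKERS = (
    "Tables must be reloaded",
    "Failed to get transaction id by lsn",
)
RECOVERABLE_MARKERS = (
    "Query timeout expired",
    "Endpoint is disconnected",
)


def classify(line):
    """Return 'fatal', 'recoverable', or None for a single log line."""
    if any(marker in line for marker in FATAL_MARKERS):
        return "fatal"
    if any(marker in line for marker in RECOVERABLE_MARKERS):
        return "recoverable"
    return None


def scan(lines):
    """Return (severity, line) pairs for every line that matches a marker."""
    hits = []
    for line in lines:
        severity = classify(line)
        if severity is not None:
            hits.append((severity, line.rstrip("\n")))
    return hits


sample = [
    "SqlState: HYT00 NativeError: 0 Message: Query timeout expired",
    "Start source from stream position",
    "The required LSN '<LSN>' not found. Tables must be reloaded",
]
for severity, line in scan(sample):
    print(severity, "|", line)
```

Flagging the recoverable lines as well would have surfaced the HYT00 timeout before the restart escalated it.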
Any best practices for:
handling MS‑CDC with long idle periods
preventing Replicate from stopping due to transient ODBC timeouts
ensuring task recovery doesn't require reloading tables?
Hi @curious_Crew ,
Old records in the [cdc].[lsn_time_mapping] table are purged according to the configured retention period. If Qlik Replicate cannot locate the last saved stream position in this table, this indicates that some records have already been purged, which can result in data loss. That is why the message advises that the table must be reloaded.
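To illustrate the check described above, here is a minimal Python sketch that compares a saved LSN with the oldest LSN still available. It rests on assumptions: the saved position is available as a hex string, the minimum LSN has been read separately (for example from sys.fn_cdc_get_min_lsn for the capture instance), and a plain byte-wise comparison is a fair stand-in for what Replicate does internally, which is not documented in this thread. The function names are hypothetical.

```python
def parse_lsn(text):
    """Convert an LSN such as '0x0000002A000001B80003' or
    '0000002A:000001B8:0003' into its 10 raw bytes."""
    cleaned = text.strip().replace(":", "")
    if cleaned.lower().startswith("0x"):
        cleaned = cleaned[2:]
    raw = bytes.fromhex(cleaned)
    if len(raw) != 10:
        raise ValueError("an LSN is expected to be 10 bytes (binary(10))")
    return raw


def lsn_still_available(saved_lsn, min_lsn):
    """Return True if saved_lsn is not older than the minimum LSN.

    Python compares bytes left to right as unsigned values, which gives
    the same ordering as comparing two binary(10) values.
    """
    return parse_lsn(saved_lsn) >= parse_lsn(min_lsn)


# Made-up LSN values, for illustration only.
print(lsn_still_available("0x0000002A000001B80003", "0x0000002A000001B80001"))
print(lsn_still_available("0x00000029000001B80003", "0x0000002A000000100001"))
```

If the second case applies to the saved stream position, the changes between the bookmark and the current minimum can no longer be read, which is why a reload is requested.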
It appears that Qlik Replicate may have been unable to move forward due to a query timeout. Since your current retention is set to 4 hours, changes older than that are removed, which explains the behavior you observed. A "Query timeout" is often treated as a recoverable error, so Qlik Replicate keeps retrying without raising an error notification. However, I would need to review your task settings and log files to confirm this in your environment. Please do not upload this information to the forum, as it may contain sensitive data.
Recommendations:
Regards,
Desmond