Skip to main content
Announcements
Product Release Webinar: Qlik Insider airing December 6! REGISTER TODAY!

Qlik Replicate: Addressing Missing Records in CDC When Loading Data from Salesforce to Snowflake

100% helpful (1/1)
cancel
Showing results for 
Search instead for 
Did you mean: 
SwathiPulagam
Support
Support

Qlik Replicate: Addressing Missing Records in CDC When Loading Data from Salesforce to Snowflake

Last Update:

Sep 27, 2023 1:40:32 AM

Updated By:

Sonja_Bauernfeind

In the data integration process from a Salesforce source to a Snowflake target, it's crucial to ensure all records are accurately captured. However, sometimes gaps can occur in Change Data Capture (CDC) operations, resulting in missing records.

In this article, we aim to explore such an issue and guide you through the required troubleshooting steps and eventual solution.

Issue

One of the source table in Salesforce is 'WorkOrder,' and discrepancies in data capture were observed in the logs. The root of the problem becomes evident when examining consecutive SELECT queries:

Query 1:

Timestamp: 2023-07-17 13:41:04.251

Query: SELECT count() FROM WorkOrder WHERE SystemModstamp > 2023-07-17T17:24:28.000Z and SystemModstamp < 2023-07-17T17:40:04Z

Query 2:

Timestamp: 2023-07-17 13:56:31.411

Query: SELECT count() FROM WorkOrder WHERE SystemModstamp > 2023-07-17T17:41:04.000Z and SystemModstamp < 2023-07-17T17:55:31Z

Upon closer examination, it's evident that the tracking column 'SystemModstamp' in Query 2, which is "> 2023-07-17T17:41:04.000Z," does not align with the previous query's 'SystemModstamp' value, which was "< 2023-07-17T17:40:04Z."

Consequently, any changes within this time period (SystemModstamp between '>= 2023-07-17T17:40:04Z' and '<= 2023-07-17T17:41:04.000Z') were not captured, leading to missing records.

Resolution

After a thorough analysis, the root cause of this issue was identified. The transaction consistency timeout of 60 seconds was found to be inadequate. To address this problem effectively, the following steps are recommended:

  • Analyze Longest Transactions: Salesforce transactions vary in duration. To ensure comprehensive data capture, it's crucial to identify the longest transactions that may exceed the default consistency timeout.

  • Set Transaction Consistency Timeout: Adjust the "Transaction consistency timeout" value in your CDC configuration to accommodate the longest transactions. This adjustment ensures that CDC operations span the entire duration of transactions, preventing data gaps and missing records.

    To set the 
    Transaction Consistency Timeout:

    1. Open the Salesforce Source Endpoint
    2. Navigate to the Advanced tab 
    3. Locate Transaction consistency timeout (sec): and set the value accordingly

      transaction consistency timeout.png

 

By following these steps, you can enhance the accuracy and completeness of your data capture process when loading data from a Salesforce source to a Snowflake target. This proactive approach will help prevent future instances of missing records and maintain the integrity of your data integration pipeline.

Labels (1)
Version history
Last update:
‎2023-09-27 04:40 AM
Updated by: