Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Is it expected to see the same record in the output of both the Full Load (Loadxxxx.csv) and Change Table (date-time.csv) when an insert occurs during the Full Load process.
So the scenario is as follows:
Task is configured with both Full Load and Store Changes configured. Apply Changes is not enabled. The goal is to generate a baseline of the source table, then perform analytics against the
00:00 Task with Full Load and Store Changes is started.
00:15 Full Load is in Progress
00:16 Insert to table being loaded occurs.
The end result is that the new record which was inserted while Full Load was in process exists in both the Full Load files and the Change Table files. Is this expected and is there a way to prevent Store Changes from logging entries which were captured during the Full Load process?
Yes that's expected.
The full-load is effectively a simple SELECT * FROM <table> ORDER BY PK
Whether or not that selects sees a row inserted after the start depends on the isolation level and specifically 'phantom reads' which depends on the source database types and settings.
(see for example https://en.wikipedia.org/wiki/Isolation_(database_systems) or https://docs.microsoft.com/en-us/sql/odbc/reference/develop-app/transaction-isolation-levels?view=sq... or )
Replicate starts CDC for a table before starting the source unload and it 'caches' those changed to be applied after unload and its target load finish. (btw - these steps can clearly be seen in the reptask log. Increase logging level for SOURCE_UNLOAD and TARGET_LOAD to TRACE as desired )
When a task uses APPLY CHANGES it will apply those cached changes (which happened from committed transactions during the select) after the target load completes. If an insert in this context creates a duplicate error it is ignored. After the table load is completed a duplicate insert would trigger an 'apply error' and is handled according to task, or default global, error handling rules.
When building change tables (CT), that duplicate handling is note (can not) be done.
Hein
Yes that's expected.
The full-load is effectively a simple SELECT * FROM <table> ORDER BY PK
Whether or not that selects sees a row inserted after the start depends on the isolation level and specifically 'phantom reads' which depends on the source database types and settings.
(see for example https://en.wikipedia.org/wiki/Isolation_(database_systems) or https://docs.microsoft.com/en-us/sql/odbc/reference/develop-app/transaction-isolation-levels?view=sq... or )
Replicate starts CDC for a table before starting the source unload and it 'caches' those changed to be applied after unload and its target load finish. (btw - these steps can clearly be seen in the reptask log. Increase logging level for SOURCE_UNLOAD and TARGET_LOAD to TRACE as desired )
When a task uses APPLY CHANGES it will apply those cached changes (which happened from committed transactions during the select) after the target load completes. If an insert in this context creates a duplicate error it is ignored. After the table load is completed a duplicate insert would trigger an 'apply error' and is handled according to task, or default global, error handling rules.
When building change tables (CT), that duplicate handling is note (can not) be done.
Hein