Hi, all!
In Replicate, when a full load is run, is it an upsert task (like a full load in Compose)?
I am trying to make sense of some records in a table. I know they were written to the source on a certain date (in this case, 2 November 2023) because the table itself has a last-modified timestamp, yet their timestamps in the __ar archive table show a later date (in this case, 7 January 2024).
I am hoping to leverage the detailed write log of the archive table, but I want to be able to trust my understanding of the timestamps.
Any ideas?
Hi @JacobTews
A full load in Replicate by default drops the target table if it exists, creates it, then selects all rows from the source and performs inserts into the target table. This is for the base tables. Could the table you are looking at have been reloaded on Jan 7?
Very possible. I've dug through the logs to see if that was the case, but I get a little lost trying to figure out which things are in UTC and which are server time.
According to the monitor section of the GUI, the last actual full load finished Nov 3 (see image full_load.png), but according to the "History" (from the big arrow drop-down), every time the task gets restarted after an error it runs a full load (see image replicate_history.png). Maybe that's a setting I need to tweak?
Hi @JacobTews
The task logs should be written in the time zone of the Replicate server. The timestamp you see in Replicate history could be in the time zone of the browser/UI or the Replicate server - not sure at the moment as I'm testing with a browser open directly on my server.
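To compare a server-time log entry with a UTC value, it can help to normalize both to UTC first. Below is a minimal Python sketch of that idea. The offset in `SERVER_TZ` is purely a placeholder (an assumption, not something taken from this thread), and `to_utc` is a hypothetical helper name; a fixed offset ignores daylight saving, so a `zoneinfo.ZoneInfo` for the server's real zone could be passed instead.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical placeholder: replace with the real UTC offset (or a
# zoneinfo.ZoneInfo object) of the machine running Replicate.
SERVER_TZ = timezone(timedelta(hours=-6))


def to_utc(stamp: str, source_tz=SERVER_TZ) -> datetime:
    """Treat a naive ISO timestamp as being in source_tz and convert it to UTC."""
    naive = datetime.fromisoformat(stamp)
    return naive.replace(tzinfo=source_tz).astimezone(timezone.utc)


if __name__ == "__main__":
    # A log entry read as server time, normalized to UTC for comparison.
    print(to_utc("2024-01-07T09:30:00").isoformat())
```

With both values in UTC, a direct comparison shows whether two events really happened on different days or only appear to because of the offset.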
I did notice, however, that doing a regular resume from where the task left off shows "TASK FULL LOAD STARTED", which is not correct.
You might try checking the task logs from when the task was started to confirm, but this will only tell you whether all tables were reloaded, not whether one was reloaded during CDC processing. About seven lines down, the log will show:
Task 'XYZ' running full load and CDC in resume mode
if your task was resumed. Otherwise, it will show:
Task 'XYZ' running full load and CDC in fresh start mode
This means all tables were reloaded.
A third option is a fresh start of CDC by timestamp (no reload):
Task 'XYZ' running full load and CDC in fresh start mode, starting from log position: 'timestamp:2022-03-01T10:15:00' (UTC)
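If there are many task logs to check, the three cases above could be sorted out with a small script. This is only a sketch in Python: it assumes the log lines contain exactly the phrases quoted above, and `classify_start_mode` is a hypothetical helper, not part of Replicate.

```python
from typing import Optional


def classify_start_mode(line: str) -> Optional[str]:
    """Classify a task-start log line using the phrases quoted in this thread.

    Returns None when the line is not a task-start line.
    """
    if "running full load and CDC in" not in line:
        return None
    if "in resume mode" in line:
        return "resume (no reload)"
    if "in fresh start mode" in line:
        # A fresh start from a log position is a CDC restart by timestamp.
        if "starting from log position" in line:
            return "fresh start from timestamp (no reload)"
        return "fresh start (all tables reloaded)"
    return None


if __name__ == "__main__":
    sample = "Task 'XYZ' running full load and CDC in resume mode"
    print(classify_start_mode(sample))
```

Substring checks are used instead of a strict pattern so that any prefix the log adds before the message does not matter.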
Hope this helps.
Dana
Hello Team,
To add to the comments above: when you analyze the logs, you can tell the difference between full loads and dummy full loads. A full load happens when you specifically reload the target. A dummy load happens when you start the task from a certain timestamp; it refreshes the metadata of the participating tables. For a real full load, the log shows:
Load finished for table 31780 rows received. 0 rows skipped. Volume transferred 98747360.
In the case of a dummy load, the volume transferred is 0:
Load finished for table 31780 rows received. 0 rows skipped. Volume transferred 0.
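To scan a log for this difference, something like the Python sketch below could work. It assumes the "Load finished" lines follow the wording shown above; `classify_load` is a hypothetical helper. Treating a line with zero rows and zero volume as "undetermined" is my own assumption, since an empty table would look the same in either case.

```python
import re
from typing import Optional

# Matches the "Load finished" wording quoted above; any table name that
# appears between "table" and the row count is skipped.
LOAD_FINISHED = re.compile(
    r"Load finished for table.*?(\d+) rows received\. "
    r"(\d+) rows skipped\. Volume transferred (\d+)"
)


def classify_load(line: str) -> Optional[str]:
    """Return 'full', 'dummy', or 'undetermined'; None if the line does not match."""
    match = LOAD_FINISHED.search(line)
    if match is None:
        return None
    rows_received = int(match.group(1))
    volume = int(match.group(3))
    if volume > 0:
        return "full"
    if rows_received > 0:
        return "dummy"
    return "undetermined"


if __name__ == "__main__":
    line = "Load finished for table 31780 rows received. 0 rows skipped. Volume transferred 0."
    print(classify_load(line))
```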
Hope this helps,
Regards,
Sushil Kumar
Thanks, @Dana_Baldwin! I did find that line in the log, and verified that it ran in Resume Mode. Is that the "dummy full load" that @SushilKumar refers to below?
Perhaps this is actually a Compose question, as Compose is the one which writes the __ar tables...
@JacobTews, if it ran in resume mode and the task is set to do both full load and CDC, then a full load did not occur.
Excellent, thank you for the clarification!