Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
So we have an ETL Process that once the data comes in from our source into snowflake that process transforms the data. We are using the replication date to know when we receive new data. One issue we have run into is when we have to reload a table because of an error or whatever reason. It then updates all of the replication dates to when the table was reloaded. The problem is that those records already exist downstream and our process wants to reprocess all of those records. I am trying to find out if there is any flag or message I can grab that would tell the downstream process that this table was just reloaded.
Hello,
You can play around with this and maybe it will help.
I created a global transformation and added a new column for reload_date.
CASE
WHEN $AR_H_STREAM_POSITION = ''
then datetime('NOW','localtime')
end
The field is only populated when a full load is triggered. I tested with 2 tables. On the initial load, both tables (table a and b) have a value of "2022-08-05 10:28:17.000000" when i reloaded table a, this now has a value of "2022-08-05 10:29:34.000000" and table b retains the original value however on CDC this column will be set to NULL
I will run additional tests - this is what i have for now 🙂
Thanks
Lyka
Thanks @pkagel , please mark it as accepted solution based on your testing results.
Thank you,