deepaksahirwar
Creator II

Need to understand the load type in Qlik Replicate

Dear Community,

I am seeking your assistance to understand how to identify from the logs whether a task in replication has started with a full reload or only with a timestamp.

I initiated one of my tasks with a date-timestamp, but it seems to be behaving similarly to a full reload. For instance, the modified dates for the full load table have been updated to the current date. Moreover, the metadata refresh took an unusually long time, more than 3 hours, for the tables, even though the data transfer count was zero.

I am sharing the logs here for your reference. I would appreciate it if you could help me understand this scenario.

Thank you in advance.

 

Sample logs for your reference.

00001036: 2024-06-10T19:42:25 [SOURCE_CAPTURE  ]I:  Additional properties = 'QUERYTIMEOUT=0;QUERYTIMEOUTINTERVAL=0;EXECUTETIMEOUT=3600;'  (db2luw_endpoint_imp.c:618)
00001036: 2024-06-10T19:42:25 [SOURCE_CAPTURE  ]I:  'Dynamic buffer size' feature is enabled = '0'  (db2luw_endpoint_imp.c:620)

00001036: 2024-06-10T19:42:26 [SOURCE_CAPTURE  ]I:  Source endpoint 'DB2 LUW' is using provider syntax '<default>'  (provider_syntax_manager.c:945)
00001036: 2024-06-10T19:42:26 [METADATA_MANAGE ]I:  The SAP Server is UNICODE  (sapdb_endpoint_imp.c:199)
00001036: 2024-06-10T19:42:26 [METADATA_MANAGE ]I:  Going to connect to server  database PROD_STAGING  (cloud_imp.c:4131)
00001036: 2024-06-10T19:42:26 [METADATA_MANAGE ]I:  Target endpoint 'SnowflakeAzure' is using provider syntax 'SnowflakeAzure'  (provider_syntax_manager.c:951)
00001036: 2024-06-10T19:42:27 [METADATA_MANAGE ]I:  ODBC driver version: '3.0.2'  (ar_odbc_conn.c:633)
00001036: 2024-06-10T19:42:28 [METADATA_MANAGE ]I:  Connected to server  database PROD_STAGING successfully.  (cloud_imp.c:4170)
00001036: 2024-06-10T19:42:28 [TASK_MANAGER    ]I:  Creating threads for all components  (replicationtask.c:2401)
00001036: 2024-06-10T19:43:51 [TASK_MANAGER    ]I:  Threads for all components were created  (replicationtask.c:2560)
00001036: 2024-06-10T19:43:51 [TASK_MANAGER    ]I:  Task initialization completed successfully  (replicationtask.c:3921)
00018580: 2024-06-10T19:43:51 [STREAM_COMPONEN ]I:  Timeline is 20240610184915668740  (ar_cdc_channel.c:503)
00011516: 2024-06-10T19:43:51 [STREAM_COMPONEN ]I:  Timeline is 20240610184915668740  (ar_cdc_channel.c:503)
00016568: 2024-06-10T19:43:51 [STREAM_COMPONEN ]I:  Timeline is 20240610184915668740  (ar_cdc_channel.c:503)
00011516: 2024-06-10T19:43:51 [TARGET_APPLY    ]I:  Going to connect to server  database PROD_STAGING  (cloud_imp.c:4131)
00011516: 2024-06-10T19:43:51 [TARGET_APPLY    ]I:  Target endpoint 'SnowflakeAzure' is using provider syntax 'SnowflakeAzure'  (provider_syntax_manager.c:951)
00011516: 2024-06-10T19:43:51 [TARGET_APPLY    ]I:  ODBC driver version: '3.0.2'  (ar_odbc_conn.c:633)
00011516: 2024-06-10T19:43:52 [TARGET_APPLY    ]I:  Connected to server  database PROD_STAGING successfully.  (cloud_imp.c:4170)
00011516: 2024-06-10T19:43:52 [TARGET_APPLY    ]I:  Restore bulk state. Last bulk last record id - '0', last applied record id - '0', target confirmed record id - '0', sorter confirmed record id - '0'  (endpointshell.c:2016)
00011516: 2024-06-10T19:43:52 [TARGET_APPLY    ]I:  Set Bulk Timeout      = 300000 milliseconds  (bulk_apply.c:563)
00011516: 2024-06-10T19:43:52 [TARGET_APPLY    ]I:  Set Bulk Timeout Min  = 30000 milliseconds  (bulk_apply.c:564)
00011516: 2024-06-10T19:43:52 [TARGET_APPLY    ]I:  Parallel bulk apply enabled, maximum number of threads is 15  (bulk_apply.c:723)
00011516: 2024-06-10T19:43:52 [TARGET_APPLY    ]I:  Working in bulk apply mode  (endpointshell.c:2024)
00016568: 2024-06-10T19:43:52 [SORTER          ]I:  Start the task using saved state. Start source from stream position timestamp:2024-06-10T16:23:43 and id 1. Confirmed target id is 0, next target id is 0, last source timestamp 0  (sorter.c:532)
00016568: 2024-06-10T19:43:52 [SORTER          ]I:  'Stop reading when memory limit reached' is set to true  (sorter.c:660)
00001036: 2024-06-10T19:43:52 [PERFORMANCE     ]T:  Source latency 0.00 seconds, Target latency 0.00 seconds, Handling latency 0.00 seconds  (replicationtask.c:3788)
00001036: 2024-06-10T19:43:52 [TASK_MANAGER    ]I:  All stream components were initialized  (replicationtask.c:3694)
00018580: 2024-06-10T19:43:52 [SOURCE_CAPTURE  ]I:  Starting processing from context 'timestamp:2024-06-10T16:23:43'  (ar_cdc_channel.c:838)
00011516: 2024-06-10T19:43:52 [TARGET_APPLY    ]I:  Bulk max file size: 2000 MB, 2048000 KB  (cloud_bulk.c:170)
00016568: 2024-06-10T19:43:52 [SORTER          ]I:  Sorter last run state: confirmed_record_id = 0, confirmed_stream_position = ''  (sorter_transaction.c:3372)
00016568: 2024-06-10T19:43:52 [SORTER          ]I:  Correcting source database time by 0 microseconds  (sorter_transaction.c:197)
00016568: 2024-06-10T19:43:52 [SORTER          ]I:  Transaction consistency reached  (sorter_transaction.c:326)
00001036: 2024-06-10T19:43:53 [TASK_MANAGER    ]I:  Starting subtask #1  (replicationtask_util.c:517)
00004640: 2024-06-10T19:43:53 [STREAM_COMPONEN ]I:  Timeline is 20240610184915668740  (ar_cdc_channel.c:503)
00007768: 2024-06-10T19:43:53 [STREAM_COMPONEN ]I:  Timeline is 20240610184915668740  (ar_cdc_channel.c:503)
00001036: 2024-06-10T19:43:53 [TASK_MANAGER    ]I:  Starting subtask #2  (replicationtask_util.c:517)
00011736: 2024-06-10T19:43:53 [STREAM_COMPONEN ]I:  Timeline is 20240610184915668740  (ar_cdc_channel.c:503)
00019348: 2024-06-10T19:43:53 [STREAM_COMPONEN ]I:  Timeline is 20240610184915668740  (ar_cdc_channel.c:503)
00004640: 2024-06-10T19:43:53 [SOURCE_UNLOAD   ]I:  ODBC driver version: '11.05.0700'  (ar_odbc_conn.c:633)
00001036: 2024-06-10T19:43:53 [TASK_MANAGER    ]I:  Starting subtask #3  (replicationtask_util.c:517)
00012440: 2024-06-10T19:43:53 [STREAM_COMPONEN ]I:  Timeline is 20240610184915668740  (ar_cdc_channel.c:503)
00012264: 2024-06-10T19:43:53 [STREAM_COMPONEN ]I:  Timeline is 20240610184915668740  (ar_cdc_channel.c:503)
00004640: 2024-06-10T19:43:53 [SOURCE_UNLOAD   ]I:  ODBC driver version: '11.05.0700'  (ar_odbc_conn.c:633)
00004640: 2024-06-10T19:43:53 [SOURCE_UNLOAD   ]I:  Source endpoint 'DB2 LUW' is using provider syntax '<default>'  (provider_syntax_manager.c:945)
00016568: 2024-06-10T19:43:59 [SORTER          ]I:  Start collecting changes for table id = 43  (sorter_transaction.c:2409)
00001036: 2024-06-10T19:43:59 [TASK_MANAGER    ]I:  Start loading table 'TRANSACTION - OTHERS'.'EKKN' (Id = 43) by subtask 1. Start load timestamp 00061A8CB19F1380  (replicationtask_util.c:755)
00012264: 2024-06-10T19:43:59 [TARGET_LOAD     ]I:  Connected to server  database PROD_STAGING successfully.  (cloud_imp.c:4170)
00011736: 2024-06-10T19:44:00 [SOURCE_UNLOAD   ]I:  resolve_table_orig_db: table_def orig_db_id of table 'SAPSR3.AFPO' = 002E0009  (db2luw_endpoint_metadata.c:266)
00011736: 2024-06-10T19:44:00 [SOURCE_UNLOAD   ]I:  Unload finished for table 'TRANSACTION - OTHERS'.'AFPO' (Id = 36). 0 rows sent.  (streamcomponent.c:3784)
00011736: 2024-06-10T19:44:00 [SOURCE_UNLOAD   ]I:  Unload finished for table 'SAPSR3'.'AFPO' (Id = 124). 0 rows sent.  (streamcomponent.c:3784)
00016568: 2024-06-10T19:44:02 [SORTER          ]I:  Start collecting changes for table id = 44  (sorter_transaction.c:2409)
00001036: 2024-06-10T19:44:02 [TASK_MANAGER    ]I:  Start loading table 'TRANSACTION - OTHERS'.'EKKO' (Id = 44) by subtask 3. Start load timestamp 00061A8CB1CCDA40  (replicationtask_util.c:755)
1 Solution

Accepted Solutions
Dana_Baldwin
Support
Support

Hi @deepaksahirwar 

A task with full load and apply changes enabled will have this log entry when you reload:

Task 'Task Name' running full load and CDC in fresh start mode

When you start by timestamp, the log entry will be:

Task 'Task Name' running full load and CDC in fresh start mode, starting from log position: 'timestamp:2022-03-01T10:15:00' (UTC)

When starting from timestamp, a false full load occurs to perform the metadata check you mentioned. The time needed for that to occur can be impacted by the source & target database performance and how current the table statistics are for traditional RDBMS endpoints.
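The difference comes down to whether that TASK_MANAGER line carries a `starting from log position` clause. As a minimal sketch (not an official tool; the message wordings are taken from the two example entries above and may vary by Replicate version):

```python
import re

def classify_start(line: str) -> str:
    """Classify a Replicate TASK_MANAGER start line as a full reload
    or a start-by-timestamp, based on the 'starting from log position'
    clause. Message wordings assumed from the examples in this thread."""
    m = re.search(r"starting from log position: '(timestamp:[^']*)'", line)
    if m:
        return f"started by timestamp ({m.group(1)})"
    if "fresh start mode" in line:
        return "full reload (fresh start)"
    return "other"

reload_line = "Task 'Task Name' running full load and CDC in fresh start mode"
ts_line = ("Task 'Task Name' running full load and CDC in fresh start mode, "
           "starting from log position: 'timestamp:2022-03-01T10:15:00' (UTC)")

print(classify_start(reload_line))  # full reload (fresh start)
print(classify_start(ts_line))      # started by timestamp (timestamp:2022-03-01T10:15:00)
```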

Thanks,

Dana


7 Replies
Dana_Baldwin
Support

(See accepted solution above.)

Heinvandenheuvel
Specialist III

The included log lines start too late. Look in the first 20 lines or so for the "[TASK_MANAGER ]I: Task" line.

That's the critical one. There are other hints, like this line from your sample text:

[SORTER          ]I: Start the task using saved state. Start source from stream position timestamp:2024-06-10T16:23:43

 

This suggests a 'resume' (from saved state).

Note that there are other indications in your log fragment that a full load is happening:

[TASK_MANAGER    ]I:  Start loading table 'TRANSACTION - OTHERS'.'EKKN' (Id = 43) by subtask 1. Start load timestamp

 

This could be because the task was not done full-loading, or perhaps there were table metadata changes on the source, or transformation/manipulation changes to the task.

Hein.

Kent_Feng
Support

Hi @deepaksahirwar 

Check your task log and look for the first TASK_MANAGER record, usually found between lines 8 and 12. From it you can tell whether the task was reloaded, restarted from a timestamp/stream position, resumed, or resumed after a crash:

The task is reloaded or started for the first time:

[TASK_MANAGER    ]I:  Task 'Task_Name' running full load and CDC in fresh start mode  (replicationtask.c:1818)

The task is restarted from a stream position:

[TASK_MANAGER    ]I:  Task 'Task_Name' running full load and CDC in fresh start mode, starting from log position: 'position:246009759683'.  (replicationtask.c:1818)

The task is restarted from a timestamp:

[TASK_MANAGER    ]I:  Task 'Task_Name' running CDC only in fresh start mode, starting from log position: 'timestamp:2024-05-25T08:00:00' (UTC).  (replicationtask.c:1818)

The task is resumed (this must be a CDC only or a Full Load + CDC task):

[TASK_MANAGER    ]I:  Task 'Task_Name' running full load and CDC in resume mode  (replicationtask.c:1770)

The task automatically restarted after a recoverable error (a task crash); the number after # tells you how many times it has restarted:

[TASK_MANAGER    ]I:  Task 'Task_Name' running full load and CDC in resume mode after recoverable error, retry #3  (replicationtask.c:1816)
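The checks above can be sketched as a small helper that finds the first matching TASK_MANAGER record and reports the start mode. This is only an illustration; the patterns are copied from the sample entries in this post and may vary by Replicate version:

```python
import re

# Ordered from most to least specific, so e.g. the 'retry' case is
# recognized before the plain 'resume mode' case. Patterns assumed
# from the sample log entries above.
PATTERNS = [
    (r"resume mode after recoverable error, retry #\d+", "auto-restart after recoverable error"),
    (r"resume mode", "resumed"),
    (r"starting from log position: 'timestamp:", "restarted from timestamp"),
    (r"starting from log position: 'position:", "restarted from stream position"),
    (r"fresh start mode", "reloaded / first start"),
]

def start_mode(log_lines):
    """Return the start mode from the first TASK_MANAGER 'running ...' record."""
    for line in log_lines:
        if "[TASK_MANAGER" in line and "running" in line:
            for pattern, label in PATTERNS:
                if re.search(pattern, line):
                    return label
    return "no TASK_MANAGER start record found"

example = ("00001036: 2024-06-10T19:42:00 [TASK_MANAGER    ]I:  Task 'MyTask' "
           "running CDC only in fresh start mode, starting from log position: "
           "'timestamp:2024-05-25T08:00:00' (UTC).  (replicationtask.c:1818)")
print(start_mode([example]))  # restarted from timestamp
```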

 

When you restart a task from a timestamp or stream position, you will also see SOURCE_UNLOAD and TARGET_LOAD information in your task log, because Replicate needs to refresh the metadata. It might take a while, and the actual time depends on your environment, table quantity, bandwidth, etc. If a metadata refresh takes a long time, like the several hours you mentioned, please open a support ticket and let our TSE investigate for you.

 

*** Greetings from Down Under ***
deepaksahirwar
Creator II
Author

Hi @Dana_Baldwin , @Heinvandenheuvel  & @Kent_Feng ,

Thank you for your ongoing support.

I have a question regarding timestamp reload. When we perform a timestamp reload, the modified date of tables gets updated. For instance, consider the table ‘XYZ,’ which had a full reload datetime of 10-06-2024 at 06:00 PM.

If I reload the task using timestamp load, the modified date refreshes to the latest one, while the data remains unchanged.

I understand that timestamp reload primarily affects metadata information for the table. However, my concern is that it removes all the past reload date history for that table.

My questions are:

Q1. In what situations should we perform timestamp reload for a task?
Q2. What are the benefits of using timestamp reload, and when is it most useful?


Thank you for your insights!

 

Best Regards,

Deepak

Heinvandenheuvel
Specialist III

>> I understand that timestamp reload primarily affects metadata information for the table. However, my concern is that it removes all the past reload date history for that table.

 

The 'timestamp' start option specifically mentions 'tables already loaded'. That's the purpose and requirement. It will specifically NOT remove that data. No worries.

>> My questions are:

>> Q1. In what situations should we perform timestamp reload for a task?

When the tables are already loaded! But not loaded by Replicate in a task with CDC activated.

An example would be that one used a full DB restore to a target, or even a disk clone, and then needed to 'catch up' with the changes since the backup/clone was made.

>> Q2. What are the benefits of using timestamp reload, and when is it most useful?

The benefit is that it does not reload the tables. Duh! That means billions of innocent bits saved. It can mean hours or even days of work saved.

It can also be useful to fill an operational 'gap': some mistake or error which happened to disturb the regular CDC process. Normally Replicate will recover itself from its saved context, even after a Replicate server crash or shutdown, but sometimes that does not work. You can then attempt to 'fix' that by restarting with a timestamp just before the issue happened.

Hein

vinayakct
Partner - Contributor II

Hi @Heinvandenheuvel ,

As I read in your last response, you mentioned:


Normally Replicate will recover itself from its saves context, even after a Replicate server crash or shutdown, but sometimes that does not work. You can then attempt to 'fix' that by restarting with a timestamp just before the issue happened.

Can you share more details on this? How does Qlik Replicate handle a server crash or sudden shutdown? And if we restart tasks with an earlier timestamp, how do we handle duplicates / missed data updates?

Can you possibly share some in-depth understanding of error handling and crashes here, so that I can understand things better?

Heinvandenheuvel
Specialist III

>> Can you share more details on this, how does Qlik Replicate handles server crash/ sudden shutdowns, 

Replicate uses frequently updated state files to remember where it was, what data it has in the sorter area, and what has been applied. It's those files, like: C:\Replicate\data\tasks\Oracle_to_Oracle\StateManager\ars_saved_state_000001.sts

>> if you are restarting the tasks with a timestamp earlier, then how do we handle duplicates / missed data updates?

You don't. The biggest risk is starting too late. Starting too early, thus with some overlap, creates issues, but typically the data is fine. Yes, an insert will be repeated and fail with a duplicate key, but as long as you understand why, then why not?! An update will be repeated - so what? A delete will be repeated and fail with a missing row - but that's OK, is it not? So you may want to just ignore the errors/warnings for a short period of time, or you might want to change the error handling policy just for the start by timestamp.

Perhaps temporarily change the task to 'update record if exists on insert', 'insert record if does not exist on update' and 'ignore delete for non-existing rows'. Start your task by timestamp and when it has caught up after minutes, or perhaps hours, stop it, restore the old error handling, and resume. Done.

If still worried - put the task on your weekend or monthly schedule for a full reload.

Hein.