Discussion board for collaboration on Qlik Replicate.
We have been trying to reload one of our largest tables since 11am on Friday, Jan 6. This post is being written on Sunday, Jan 8.
We have defined 11 segments.
PROBLEM: The task keeps restarting, which also restarts the reload of this large table before it has had time to complete. The time between restarts has ranged from about 1 hour to almost 16 hours. The table would likely load in about 17 hours with all segments running.
This message typically appears in the log right before the task ends, but it has not been there every time:
00013944: 2023-01-07T02:48:50 [SOURCE_CAPTURE ]I: External stop signalled (db2luw_endpoint_capture.c:3460)
00013944: 2023-01-07T02:48:50 [SOURCE_CAPTURE ]I: External stop signalled (db2luw_endpoint_capture.c:3468)
Does this message indicate that the source disconnected the session?
Thanks,
Martin
Hello @Martin11 ,
Agree @SwathiPulagam , besides that I'd like to add:
1- Please check whether any scheduler, script, or manual operation is trying to stop the task.
2- Check whether the task crashed. If the task restarted automatically, look at the newer task log file; in general there will be a line near the beginning of the new file that looks like:
[TASK_MANAGER ]I: Task 'xxxxxx' running full load and CDC in resume mode after recoverable error, retry #1 |
The "retry #n" means the task crashed earlier; "n" is the retry number.
3- If the task hit an error or crashed, there may be some clues in the Windows Event Viewer or in the DB2 log file.
4- Try turning off "Apply changes" and "Store changes" in the task settings, keeping only Full Load ON, and run the task again. This shows whether the error is caused by a connection issue (a timeout, a broken connection, a firewall killing a 'too long' connection, etc.) or is only relevant to the CDC component "SOURCE_CAPTURE".
5- Turn on TRACE for the SOURCE_CAPTURE/SOURCE_UNLOAD components to get more information. Turn it on just before the expected error time, to avoid generating an overly large task log file.
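If there are many log files to go through for step 2, a small script can pick out the relevant lines. The following is only a rough sketch in Python: the function name `scan_log_lines` and the two regular expressions are my own, and the patterns are based solely on the log lines quoted in this thread, so they may need adjusting for your Replicate version.

```python
import re
from typing import Dict, Iterable, List

# Pattern for lines such as:
# 00013944: 2023-01-07T02:48:50 [SOURCE_CAPTURE ]I: External stop signalled (...)
# (assumed format, taken only from the lines quoted in this thread)
STOP_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\s+"
    r"\[SOURCE_CAPTURE\s*\]I:\s+External stop signalled"
)

# Pattern for lines such as:
# [TASK_MANAGER ]I: Task 'xxxxxx' running full load and CDC in resume mode
# after recoverable error, retry #1
RETRY_RE = re.compile(
    r"\[TASK_MANAGER\s*\]I:\s+Task '(?P<task>[^']*)' running .*"
    r"after recoverable error, retry #(?P<retry>\d+)"
)


def scan_log_lines(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Return one event per line that matches either pattern, in file order."""
    events: List[Dict[str, object]] = []
    for line in lines:
        stop = STOP_RE.search(line)
        if stop:
            events.append({"kind": "external_stop", "timestamp": stop.group("ts")})
            continue
        retry = RETRY_RE.search(line)
        if retry:
            events.append(
                {
                    "kind": "retry",
                    "task": retry.group("task"),
                    "retry": int(retry.group("retry")),
                }
            )
    return events
```

To use it, open a task log and pass the file object in, for example `scan_log_lines(open(path, encoding="utf-8", errors="replace"))`, where `path` is the location of your own task log file. A "retry" event near the top of a file suggests the previous run ended with a recoverable error rather than a deliberate stop.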
Hope this helps.
Regards,
John.
Hi @Martin11 ,
Please check with your DBA team whether someone manually killed the sessions.
Thanks,
Swathi
Update:
We reduced the number of parallel load segments on this large table from 11 to 4 and the table loaded, although it took 44 hours.
We've loaded another large table with 10 segments with no problem. Could there be something in the data that causes the issue (constant restarts of the load after 1.5 to 14 hours)?
Martin