Estimated Task Stopping Time
Question:
Is it possible to estimate the time required for a specific running task to stop?
Justification:
A server configuration change can sometimes require one or more tasks to be stopped beforehand and restarted afterwards.
These changes are often time-sensitive when performed in production environments, so it would be helpful to know how long the stop might take before beginning. That would allow stakeholders to be informed and to manage their expectations about the data lag they will experience.
Hello,
The current behavior when stopping a task is as follows: the process waits for open transactions to complete. If they have not completed by the time a timeout of a few minutes expires, the task is killed.
Replicate waits up to 30 minutes after the stop request before the timeout occurs.
Thanks
Lyka
So there is no way to estimate the exact stopping time, but all tasks will stop within 30 minutes?
Yes, you are correct.
We wrote an article related to this:
https://community.qlik.com/t5/Knowledge/Qlik-Replicate-Task-Stop-timeout-occurred/ta-p/1894021
Hope this helps!
Thanks
Lyka
[Note: as with many topics in this QEM sub-forum, this is really a Replicate question. QEM is only the messenger and does not have ultimate control.]
Replicate task stop time is normally not a problem with transactional apply, but it can be a problem with the (recommended) batch apply mode. If a batch of changes is actively being applied, Replicate acts on the stop request only once that batch has finished. How long that takes is highly deployment-specific. For many deployments, a typical batch finishes in less than 30 seconds. For some, with very large batches or frequent errors during the bulk operations, it may take more than 30 minutes.
Your best bet is to run critical tasks with TARGET_APPLY logging set to TRACE during a period known to be busy. After an hour or so, roll the log and analyze it. It may be as simple as counting the number of 'bulk finished' messages in the log and calculating the average time. Others may want to know the minimum, maximum, and average, possibly along with the number of changes processed. You can figure this out with just a text editor, or with 'grep' for occasional use. For repeated evaluation you may want to script this with Python, PowerShell, Awk, or really any scripting language of your choice. Attached is a Perl script (-h for help, -b for batch summary, -v for batch details per table).
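For readers without the attached script, the log analysis described above could be sketched in Python roughly as follows. This is only an illustration under assumptions: it assumes each log line carries a timestamp of the form 2022-03-01T10:15:30, and it matches the 'bulk finished' wording mentioned above case-insensitively; check both against your own task log. The function names are made up for this example. It measures the time between consecutive matching lines, which includes any idle time between batches, so treat the result as a rough indication rather than an exact batch duration.

```python
import re
from datetime import datetime
from statistics import mean

# Assumption: each log line contains a timestamp like 2022-03-01T10:15:30.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")

# Marker wording taken from the post above; adjust it to what your log shows.
MARKER = "bulk finished"


def batch_intervals(lines, marker=MARKER):
    """Return the seconds elapsed between consecutive lines containing marker."""
    stamps = []
    for line in lines:
        if marker.lower() not in line.lower():
            continue
        match = TIMESTAMP.search(line)
        if match:
            stamps.append(datetime.strptime(match.group(0), "%Y-%m-%dT%H:%M:%S"))
    # Pair each timestamp with the next one and take the difference.
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


def summarize(intervals):
    """Return count, min, max and average of the intervals, or None if empty."""
    if not intervals:
        return None
    return {
        "batches": len(intervals),
        "min": min(intervals),
        "max": max(intervals),
        "avg": mean(intervals),
    }
```

To try it on a real log, open the rolled log file and pass the file object to batch_intervals, then pass the result to summarize. The maximum is probably the most useful number when estimating a worst-case stop time.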
Waiting is great and 'nice', but alternatively you might just want to 'kill' (OS command) the Replicate tasks that need to be restarted, either immediately or after, say, 2 minutes. Typically a task recovers just fine on restart, with no data loss.
The (re)start time depends on A) the number of tables and B) the time needed to re-establish the position in the transaction log. For SQL Server T-log backups that can take several minutes, as those are read and evaluated sequentially.
>> The current behavior when stopping a task is as follows: the process waits for open transactions to complete. If they have not completed by the time a timeout of a few minutes expires, the task is killed.
IMHO this is not clear enough and possibly just wrong.
What is meant by "wait for open transaction completion"? I am interpreting it as a source-side change that has been seen without a matching commit. Is that the suggestion? In that case it is incorrect. I just verified: I started an update (Oracle source), did not commit, saw the 'incoming change waiting for source commit', and stopped the task, which it did in seconds with that transaction still open.
If the transaction completion was meant to be about the target, that is also not correct in bulk mode, as it tries to finish each table in the active bulk, and each table is in its own transaction.
There is a concept of "wait for open transaction completion" on task START. It has a user-configurable timeout. Replicate waits to be sure that, for the changes it sees as it starts, it has also seen the start of the transaction, making sure all transactions that were open when it started have committed. It wants to capture all or nothing, not half the changes of a transaction.
Please clarify and/or verify.
Thanks,
Hein.