Freeze
Contributor II

Estimated Task Stopping Time

Question:

Is it possible to estimate the time required for a specific, running, task to be stopped?

 

Justification:

Performing a server configuration change can sometimes require one or more tasks to be stopped beforehand and restarted afterwards, as pre and post steps.

These changes are often time-sensitive if they're being performed in a production environment, and it would be beneficial to know how long the process might take before beginning. That would allow stakeholders to be informed and to manage their expectations for the data lag they'll experience.

5 Replies
lyka
Support

Hello,

 

The current behavior when stopping a task is as follows: the process will wait for open transaction completion. If the transactions have not completed before a timeout of a few minutes, the task will be killed.

 

With Replicate, it will wait for 30 minutes after the task is stopped before the timeout occurs.

 

 

Thanks

Lyka

Freeze
Contributor II
Author

So there is no way to estimate the exact stopping time, but all tasks will stop within 30 minutes?

lyka
Support

Yes, you are correct.

We wrote an article related to this:

https://community.qlik.com/t5/Knowledge/Qlik-Replicate-Task-Stop-timeout-occurred/ta-p/1894021

 

Hope this helps!

Thanks

Lyka

Heinvandenheuvel
Specialist III

[Note: as with many topics in this QEM sub-forum, this is really a Replicate question. QEM is only the messenger and does not have the ultimate control.]

The Replicate task stop time is normally not a problem with transactional apply, but it can be a problem for the (recommended) batch apply mode. If a batch of changes is actively being applied, then Replicate only acts on the stop request when that batch is finished. This is highly deployment-specific. For many deployments, a typical batch finishes in less than 30 seconds. For some deployments, with very large batches or frequent errors during bulk apply, it may take more than 30 minutes.

Your best bet is to run critical tasks with TARGET_APPLY set to TRACE during a period known to be busy. After an hour or so, roll the log and analyze it. It may be as simple as counting the number of 'bulk finished' messages in the log and calculating the average time. Others may want to know the minimum, maximum and average, possibly with the number of changes processed. You can figure this out with just a text editor, or with 'grep' for occasional use. For repeated evaluation you may want to script this with Python, PowerShell, Awk or any scripting language of your choice, really. Attached is a Perl script (-h for help, -b for batch summary, -v for batch details per table).
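For those who prefer Python, the counting and averaging described above could be sketched roughly as follows. This is not the attached Perl script. It assumes each relevant log line contains the text 'bulk finished' plus a timestamp shaped like 2024-01-31T10:15:30; both the timestamp pattern and the function names are assumptions to adjust against a real log. The gap between two consecutive 'bulk finished' lines is used as an upper bound for the duration of one batch.

```python
import re
import sys
from datetime import datetime
from statistics import mean

# Assumption: log lines carry a timestamp like 2024-01-31T10:15:30.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")
# Message text mentioned in the post; matched case-insensitively.
MARKER = "bulk finished"


def bulk_finish_times(lines):
    """Return the timestamps of all lines that contain the marker text."""
    times = []
    for line in lines:
        if MARKER in line.lower():
            match = TIMESTAMP.search(line)
            if match:
                times.append(datetime.strptime(match.group(0), "%Y-%m-%dT%H:%M:%S"))
    return times


def interval_stats(times):
    """Return (count, min, max, mean) of the gaps, in seconds, between
    consecutive timestamps, or None when there are fewer than two."""
    gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
    if not gaps:
        return None
    return len(gaps), min(gaps), max(gaps), mean(gaps)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python bulk_stats.py <path to rolled task log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        stats = interval_stats(bulk_finish_times(log))
    if stats is None:
        print("Fewer than two 'bulk finished' messages found.")
    else:
        count, shortest, longest, average = stats
        print(f"intervals={count} min={shortest:.0f}s max={longest:.0f}s avg={average:.1f}s")
```

The maximum is probably the most useful number for planning, since a stop request may arrive just after a long batch has started.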

Waiting is great and 'nice', but alternatively you might just want to 'kill' (OS command) the Replicate tasks that are to be restarted, either immediately or, say, after 2 minutes. Typically a task will recover just fine, with no data loss, on restart.
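The "wait a little, then kill" idea can be illustrated with a generic Python sketch. It is not Replicate-specific: it only works on a process started from the same script (a `subprocess.Popen` object), the function name `wait_then_kill` is hypothetical, and finding the OS process of an actual Replicate task is left out entirely.

```python
import subprocess
import sys


def wait_then_kill(proc, grace_seconds):
    """Wait up to grace_seconds for proc to exit on its own.

    Returns "stopped" if it exited in time, or "killed" if it had to be
    terminated forcibly after the grace period.
    """
    try:
        proc.wait(timeout=grace_seconds)
        return "stopped"
    except subprocess.TimeoutExpired:
        proc.kill()   # forceful termination, like an OS-level kill
        proc.wait()   # reap the process so it does not linger
        return "killed"


if __name__ == "__main__":
    # Stand-in for a long-running task: a child process that sleeps.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(300)"])
    # In practice the grace period would be something like 120 seconds.
    print(wait_then_kill(child, grace_seconds=2))
```

The grace period gives a task that is about to finish its batch the chance to stop cleanly, while still putting a hard limit on the wait.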

The (re)START time depends A) on the number of tables, and B) on the time to re-establish the position in the transaction log. For SQL Server T-log backups that can take several minutes, as those are read and evaluated sequentially.

 

Heinvandenheuvel
Specialist III

>> The current behavior when stopping a task is as follows: the process will wait for open transaction completion. If the transactions have not completed before a timeout of a few minutes, the task will be killed.

IMHO this is not clear enough and possibly just wrong.

What is meant by "wait for open transaction completion"? I am interpreting it as a source-side change having been seen but no commit seen. Is that the suggestion? In that case it is incorrect. I just verified: I started an update (Oracle source), did not commit, saw the 'incoming change waiting for source commit', and stopped the task, which it did in seconds with that transaction still open.

If that transaction completion was intended to be about the target, well, that's also not correct in bulk mode, as it tries to finish each table in the active bulk and each table is in its own transaction.

There is a concept of "wait for open transaction completion" on task START. It has a user-configurable timeout. Replicate waits to be sure that, for the changes it sees as it starts, it has also seen the transaction start, making sure all transactions which were open when it started have committed. It wants to capture all or nothing, not half the changes for a transaction.

Please clarify and/or verify.

Thanks,

Hein.