NPRINTING ENGINE CORES ARE ASSIGNED TO ALREADY ABORTED TASKS
We have the following setup:
- NPrinting June 2019 with two engines. Each one runs on a Windows Server 2012 R2 virtual machine with an E5-2643 v2 3.5 GHz CPU and 128 GB of RAM.
One of the servers has NPrinting Server and Engine installed; the other has only the additional engine.
- QlikView April 2019.
Using local connections (not sure whether this issue also happens with QVP connections, since those are much slower), we have seen the following behaviour in this console: https://servername:4993/#/admin/plan
Sometimes, when one or more local connections are running a task, the task gets stuck, normally when it is about to finish. After a while, we abort the task so we can try again or continue with another one.
However, as seen in the attached snapshots, the aborted task keeps its NPrinting cores/resources assigned, and in consequence other tasks can't use those resources, making the environment unstable and much slower. The only way to release those resources is to restart the NPrinting Scheduler service. But obviously this is not a solution, since this issue is happening basically every day.
Please note that:
- The QVWs associated with these tasks meet all requirements and recommendations provided by Qlik help (no triggers, no "always one selected value", minimized objects, etc.). Size ranges from 10 MB to 200 MB.
- The reports associated with these tasks make heavy use of levels and cycling. A successful execution takes from 10 to 60 minutes depending on the specific report.
We believe that high CPU usage during report execution is impacting the Scheduler service, but can't say for sure.
Best Regards, Ruggero
First of all, thank you all for the suggestions and good advice on this topic. Everything was very helpful in improving the stability and performance of our reports.
However, unfortunately, the R&D team has identified 3 issues/bugs affecting us (classified as OP-8457, OP-8394 and OP-8871). Let me paste the R&D comments below:
OP-8457 (resolved in the upcoming NPrinting November 2019 release) -> RabbitMQ requests got lost because the engine was unable to send them back to the Scheduler (for some reason the RabbitMQ broker closed the channel, hence the error you see in the RabbitMQ logs).
OP-8394 (I guess related to the previous OP-8457) -> The fact that the 4 lost requests remained there even after an abort is known to me, but it is part of the OP-8394 improvement, which has not been planned yet. In any case, this essentially does not produce any practical issue for the customer, just a theoretical, very small performance degradation, probably difficult to measure, due to sub-optimal scheduler allocations (the scheduler thinks that there are four requests to resolve on that connection and so, once in a while, depending on starvation mechanisms, allocates one or more resolvers for them).
OP-8871 -> qv.exe gets stuck sometimes. The workaround is a script that could run every 20-30 minutes to clean up potentially stuck qv.exe processes.
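In case it helps anyone, here is a rough sketch of what such a cleanup script for OP-8871 could look like. This is not the script from R&D, just my own illustration under several assumptions: a qv.exe process is treated as "stuck" only because it is older than a threshold (90 minutes here, chosen to exceed our longest 60-minute report), the process list comes from a PowerShell `Get-Process` call, and the process is ended with `taskkill`. The names `MAX_AGE`, `LIST_CMD`, `parse_processes` and `find_stuck` are made up for this example.

```python
import subprocess
import sys
from datetime import datetime, timedelta

# Assumption: a qv.exe older than this is considered stuck. It must be longer
# than the longest legitimate report run (up to 60 minutes in our case).
MAX_AGE = timedelta(minutes=90)

# Assumption: PowerShell is available and prints one "pid,start" line per
# qv.exe process, with the start time in local time.
LIST_CMD = [
    "powershell", "-NoProfile", "-Command",
    "Get-Process -Name qv -ErrorAction SilentlyContinue | "
    "ForEach-Object { '{0},{1:yyyy-MM-ddTHH:mm:ss}' -f $_.Id, $_.StartTime }",
]


def parse_processes(output):
    """Parse 'pid,start' lines into a list of (pid, start_time) tuples."""
    procs = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines in the command output
        pid, started = line.split(",", 1)
        procs.append((int(pid), datetime.strptime(started, "%Y-%m-%dT%H:%M:%S")))
    return procs


def find_stuck(procs, now, max_age=MAX_AGE):
    """Return the PIDs of processes that have been running longer than max_age."""
    return [pid for pid, started in procs if now - started > max_age]


def kill(pid):
    """Force-terminate one process by PID using the Windows taskkill tool."""
    subprocess.run(["taskkill", "/PID", str(pid), "/F"], check=False)


def main():
    result = subprocess.run(LIST_CMD, capture_output=True, text=True, check=False)
    for pid in find_stuck(parse_processes(result.stdout), datetime.now()):
        kill(pid)


# Only act when run directly on a Windows machine.
if __name__ == "__main__" and sys.platform == "win32":
    main()
```

Something like this could be run from Windows Task Scheduler every 20-30 minutes on each engine machine. Age alone is a crude test, so the threshold needs to be tuned to your own report durations; otherwise a long but healthy report would be killed.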
Let me clarify that (after many performance improvements) these issues currently affect only certain documents.
If you guys have any additional input on these JIRA issues, please let me know.