QGTFS
Contributor III

Reload Engine and Distribution service instability

Hello Qlik Community,
 
I'm asking for your help with an elusive issue: instability of the Reload Engine and the Distribution Service, mostly the Distribution Service.
 
The symptoms are as follows. At seemingly random times of the day, we receive this alert:
"QMC on machine XXXXX reports that one or more Qlikview Distribution Services failed to respond. Please check QMC for more info"
Sometimes we also receive a second alert at the same time:
"The service 'ReloadEngine@XXXXX ' is down. Service Url is http://XXXXX :4720/QDS/Service"
 
For context:
We were using QlikView 11.2 until recently and have since upgraded to version 12.9 Initial Release, going to 12.7 first and then to 12.8. The problem appeared when we first upgraded to 12.7, and the situation changed with QlikView 12.8 SR2 (12.80.20200.0).
The first failures were correlated with CPU overload during the reload of some reports. Since then, we have optimized the queries, which reduced the occurrence of the alert overall, although it did not eliminate it.
When the problem first appeared, we also applied a daily restart of the services, which helped but did not solve the problem. We later removed it with the latest version, as it seemed suboptimal in the new setup, which in turn reduced the frequency.
To get to this point, we followed the guides to help troubleshoot the problem: 
 
 
Part of it also seems to be related to reloads with NPrinting, which trigger the issue.
With the 12.8 SR2 version, one of the patch notes seemed to target a similar issue, QV-22417 - QVS crashed intermittently under heavy load and interaction:
"The QVS server crashed intermittently while evaluating QlikView chart object calculations. 
This was not caused by any specific calculation but was mainly connected to concurrency around shared chart objects, particularly sessions connecting/disconnecting and fast type changes of linked shared objects. 
Multiple additional synchronizations and safeguards have been introduced to safely handle sharing sessions attaching and detaching. 
Shared object with linked (replicated) object type changes have been restricted to safe combinations."
 
 
After the change from 12.8 IR to 12.8 SR2, the failures of the QlikView Distribution Service together with the Reload Engine changed a little in behaviour:
They now appear at seemingly random times, with no regularity in the time of day, and also more often, even though the specs of the server have been increased a little. We increased them because we also want to move users from the QlikView Plugin to Ajax for accessing the reports, and so wanted to keep a margin of memory.
The server runs Microsoft Windows Server 2019 Standard; the CPU is 2.30 GHz with 8 cores, and it has 72 GB of memory.
This server is dedicated to QlikView and is currently still in test, meaning it is not accessed by end users, only by admins.
We've also tried capping CPU usage and uncapping it (65%, 95%, 100%), with no change. Currently, it is capped at 95%.
We've analyzed the logs as well, without finding any clues or much of a trace.
Following the advice from this guide:
We've looked into performance, although it only seems to point to the consequence: high CPU consumption after the service failure.
 
In 12.9, the issue persists, even with little load on the server: only QlikView, NPrinting, and me as the only connected user.
 
Has anyone encountered a similar issue, or does anyone have a clue as to what could cause the distribution failure?
 
 

 

10 Replies
QGTFS
Contributor III
Author

Sorry, I spotted my mistake in the config file: I had left whitespace in " GetQdsInfoTimeOutInMs ". I've updated it and am currently testing.
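
For anyone who wants to rule out the same whitespace mistake, here is a minimal Python sketch of a check. It assumes the setting lives in a .NET-style XML config file with `<add key="..." value="..." />` entries; the post does not say which file it was, and the value 60000 and the second key in the sample are made-up placeholders, not recommended settings.

```python
import xml.etree.ElementTree as ET


def find_padded_keys(config_xml: str) -> list[str]:
    """Return every <add key="..."> whose key has leading or trailing whitespace."""
    root = ET.fromstring(config_xml)
    padded = []
    for entry in root.iter("add"):
        key = entry.get("key")
        # A padded key will not match the name the service looks up,
        # so the setting may be silently ignored.
        if key is not None and key != key.strip():
            padded.append(key)
    return padded


# Hypothetical sample: layout, value and the second key are assumptions.
sample = """<configuration>
  <appSettings>
    <add key=" GetQdsInfoTimeOutInMs " value="60000" />
    <add key="SomeOtherSetting" value="true" />
  </appSettings>
</configuration>"""

for bad_key in find_padded_keys(sample):
    print(f"Key with stray whitespace: {bad_key!r}")
```

To check a real file, read its contents into a string and pass that to `find_padded_keys`; an empty result means no key is padded with whitespace.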