There are several articles in Community about how to set up load balancing for a Qlik Sense Enterprise on Windows cluster (e.g. https://community.qlik.com/t5/Official-Support-Articles/Load-Balancing-in-Qlik-Sense-Enterprise-on-W...). They focus on the setup of the nodes in the Qlik cluster itself, but not on how the upstream hardware load balancer (e.g. Citrix NetScaler) decides, in a failover situation, where to route an https/wss client.
The typical setup in a simple two-node environment looks like this. There are more Windows services involved on each node, but for our purposes we only look at the Proxy and the Engine.
The hardware load balancer has to make sure that users do not need to know the machine names (DNS host names) of the nodes themselves, but only something generic such as qlik.company.com ...
The hardware load balancer, despite its name, does not load balance between the engines - and it shouldn't. Such rules are set up on the proxy itself, and any proxy can route to any engine (deciding where to open a given app based on many possible configurations).
But if the central node goes down, the hardware load balancer needs to re-route the traffic to the remaining node. The only thing users will notice is a short session disconnect, if their session was on an app on the lost node.
Again, we want to avoid the client having to know the hostname of the backup node; it should simply continue to work against qlik.company.com.
What does the load balancer need to check to decide whether the central node is alive? There are enterprise solutions that check Windows services, but it can be as simple as a REST call to this endpoint:
/engine/healthcheck
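As a minimal sketch, such a probe could be a plain HTTPS GET against that path, assuming the generic hostname qlik.company.com from above (any other hostname works the same way):

```python
# Minimal healthcheck probe sketch. The hostname is the generic DNS name
# served by the hardware load balancer, not an individual node name.
import urllib.request


def healthcheck_url(host: str) -> str:
    """Build the engine healthcheck URL for a given hostname."""
    return f"https://{host}/engine/healthcheck"


def probe(url: str, timeout: float = 3.0) -> bool:
    """Return True when the endpoint answers HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False  # connection refused / timed out -> treat as down


# Example (requires network access to the cluster):
# probe(healthcheck_url("qlik.company.com"))
```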
But wait a minute: to reach this endpoint, which returns a nice report about the Qlik engine's health in JSON format, you'd need to be logged in first. And even then, how do you make sure that the healthcheck request is not load-balanced to the backup node, returning a misleading health status?
The answer is to configure new virtual proxies, one for each node. The essential part of the configuration is:
For the central node, for example:
And likewise for the backup node (called Scheduler in my example):
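As a sketch, the relevant virtual proxy settings in the QMC might look roughly like this; the exact field labels can differ between Qlik Sense versions, and the prefixes "central" and "scheduler" are just example names:

```
Prefix:                      central          (or "scheduler" on the other virtual proxy)
Session cookie header name:  X-Qlik-Session-central
Anonymous access mode:       Always anonymous user
Load balancing nodes:        Central          (only this single node)
```

The key points are the anonymous access mode, so the load balancer does not need to authenticate, and restricting the load balancing nodes to exactly one node, so the healthcheck always reports on that node and no other.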
Now you can call two individual endpoints like this:
Put the name of the virtual proxy between your hostname (... .com/) and /engine/healthcheck.
Each one will, without authentication, report the health state of the respective node, along with further information (e.g. saturated: true/false) that can be useful.
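The two endpoints can then be polled from a monitoring script or the load balancer itself. A sketch, assuming the example prefixes "central" and "scheduler" and the hostname qlik.company.com from above (the "saturated" field is mentioned in the report; treat any other fields as version-dependent):

```python
# Poll each node's healthcheck via its dedicated virtual proxy prefix.
import json
import urllib.request

HOST = "qlik.company.com"          # generic hostname on the load balancer
PREFIXES = ["central", "scheduler"]  # one virtual proxy prefix per node


def parse_health(payload: str):
    """Extract the fields a failover decision might act on."""
    data = json.loads(payload)
    # "saturated" is reported by the engine healthcheck; other keys vary.
    return {"saturated": data.get("saturated")}


def node_health(prefix: str, timeout: float = 3.0):
    """Return the parsed health report, or None if the node is unreachable."""
    url = f"https://{HOST}/{prefix}/engine/healthcheck"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_health(resp.read().decode("utf-8"))
    except OSError:
        return None  # unreachable -> the load balancer should fail over


# Example (requires network access to the cluster):
# for p in PREFIXES:
#     print(p, node_health(p))
```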