I want to make sure an environment is optimal for a new QlikView production deployment. Any feedback would be greatly appreciated.
It is a two-node QlikView Server cluster running on two large servers. For now the second node has been removed: we noticed some initial performance issues with it, so we deactivated it while we validate the primary node. We do not yet have a dedicated file share, so the first/primary node is acting as the file share location. All of the services, including QMS and a single Publisher license, are installed on this first node, and every configurable path in the QMC points to the primary server via UNC paths. Both servers sit inside the company's intranet. Two smaller servers running QlikView WebServer have been placed in the DMZ to act as external web servers. The necessary ports have been opened so the servers can talk to each other, and we are in the process of setting up SSL on the two web servers for HTTPS.
Some initial oddities we have noticed:
When coming from the internet to the web servers, everything works as desired (alternate login page to AccessPoint), but when a document is selected the user is prompted again for credentials before the document opens.
It appears that internal users opening a document with the IE plugin get faster response than through the Ajax client. I believe the plugin won't work for internet clients because the port is still blocked from direct access; QlikView Server is internal, and we are wary of opening it to the world. (If I'm not mistaken, the plugin bypasses the web server, which is a security concern.)
As mentioned earlier, there was significant lag in the web experience when the second node of the cluster was enabled, especially when a user was assigned to it, but even when not. Initially we just shut down the QVS service on the second node while trying to figure things out, but that didn't help. We had to physically remove the node from the cluster within the QMC to restore basic response times for things like logging in and reaching AccessPoint.
The environment was built under the assumption that the web servers should be in the DMZ and the remaining QlikView servers should not. Is that the recommended best practice? Any additional suggestions and thoughts would be greatly appreciated!
If the document prompts for credentials, it is likely because of Section Access security: even if you have filesystem permissions on the file, the QVW can store an additional security table, and if you are not listed in it, you cannot log on. Usually, adding the account running the QlikView services (in the form DOMAIN\USERNAME) and your own accounts (IT admins, QlikView admins) to that table, then reloading and saving the file, solves it.
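To illustrate what that table does, here is a minimal sketch in Python of the gating logic (this models the concept only; the real table lives in the QVW load script, and the account names below are made up):

```python
# Minimal model of how a Section Access table gates document entry:
# QlikView matches the connecting identity against NTNAME rows stored
# in the QVW itself, independent of any filesystem permissions.
SECTION_ACCESS = [
    {"ACCESS": "ADMIN", "NTNAME": "DOMAIN\\QVSERVICE"},  # services account (hypothetical)
    {"ACCESS": "ADMIN", "NTNAME": "DOMAIN\\QVADMIN"},    # QlikView admin (hypothetical)
]

def can_open(ntname):
    # Matching is case-insensitive; no matching row means no entry,
    # even for users who can read the .qvw file on disk.
    return any(row["NTNAME"].upper() == ntname.upper() for row in SECTION_ACCESS)
```

The point is that this check happens inside the document, on top of whatever NTFS or DMS authorization is already in place.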
If the plugin finds port 4747/TCP open, it uses it, bypassing the web server and talking directly to QlikView Server, and is therefore usually faster (not always, but often). If the plugin does not see port 4747 open, it falls back to 80/TCP (standard HTTP), which QlikView calls tunneling. This is noticeably slower, since the plugin then talks to the web server, which in turn talks to the server, with all traffic encapsulated.
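You can reproduce that decision from a client machine with a quick probe (a sketch; `plugin_transport` is just an illustrative helper, not part of QlikView, and the host name would be your own server):

```python
import socket

def plugin_transport(host, qvs_port=4747, timeout=2.0):
    """Mimic the plugin's choice: direct to QVS on 4747/TCP if reachable,
    otherwise fall back to HTTP tunneling through the web server."""
    try:
        with socket.create_connection((host, qvs_port), timeout=timeout):
            return "direct"   # bypasses the web server
    except OSError:
        return "tunnel"       # encapsulated over 80/TCP, noticeably slower
```

Running this from inside the LAN versus from the internet should show why internal plugin users feel faster: externally, 4747 stays blocked and everything tunnels.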
If you notice performance degradation just by adding a node to the cluster and applying the license:
Do both servers have the same hardware configuration: number of CPUs, clock speed, amount and speed of RAM, etc.? (Specifically CPU, RAM and chipset settings: NUMA, prefetching, node interleaving...)
How is load balanced between the servers? There are cases where communication between the web server and QlikView Server is lost because a hardware or software load balancer randomly switches users from one server to the other, giving the impression that a click never reaches the server, or that the reply never gets back to the user. So use sticky sessions, cookies, affinity, or whatever your network equipment vendor recommends to keep a user on one server once the session has been opened.
Are users on both nodes affected by the performance degradation, or just one of the nodes?
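The session-affinity point above can be sketched like this (a hypothetical cookie name and node names; in practice the stickiness is configured on the load balancer or network equipment, not written in application code):

```python
import itertools

class StickyBalancer:
    """Pin each session to the node that served its first request."""
    def __init__(self, nodes):
        self._rr = itertools.cycle(nodes)  # round-robin for brand-new sessions

    def route(self, cookies):
        # An existing affinity cookie keeps the user on the same node,
        # so clicks and replies always travel through one server.
        if "qv_node" not in cookies:
            cookies["qv_node"] = next(self._rr)
        return cookies["qv_node"]
```

Without this pinning, consecutive requests from one session can land on different QVS nodes, which matches the "click never arrives / reply never returns" symptom described above.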
This is one of those cases where it's worth the time engaging QlikView Consulting Services so they can check the configuration and the platform and help you move on.
Regarding the document prompting for credentials: we are using DMS authorization and we do not currently have any Section Access security defined. Are you suggesting we add it, or perhaps add the service account as a user of the document under Authorization? Another interesting point: when the credentials dialog appears after selecting a document, you can click Cancel instead of providing credentials and it still opens the document, so it is more of a nuisance than a blocker. It might be related to the custom user authentication we are currently using alongside an ODBC Directory Service with the same prefix to share/merge groups (we plan to replace the custom users with SSO WebTickets in the near future, so the custom user implementation is temporary).
Regarding the cluster nodes: the servers are identical, and we saw the performance degradation just trying to log into AccessPoint and list documents. We are using only one of the web servers, pointed at the QlikView Server cluster, for now. It took 30 to 60 seconds to see the document list on AccessPoint while the second node was in the cluster, versus a couple of seconds or less without it. We tried each of the QlikView cluster load-balancing settings. When interacting with a document assigned to the second node, performance was considerably slower than when it was assigned to the first node. I wonder if this is related to the documents being shared from the first node, so we are in the process of standing up an external file server and testing it with the first node before adding the second node back. (Note: we have the documents set to preload on the cluster.)
So, any recommendations or best practices on service placement in and out of the DMZ for internet-facing applications? (Note: the QlikView Server cluster is the only QlikView environment at this point, serving both internet and intranet users.)
As for the credentials, this sounds like a bug we experienced some time ago, though I cannot recall the exact details right now. I'd suggest logging a case with Support at email@example.com just to confirm it and check whether a patch is available.
As long as the service account is the one that opens, reloads, and saves the documents, I'd strongly recommend adding it to the distribution, or making it Document Administrator and Supervisor in both the QVS and QDS services in the Console. You can check that:
for the user folders: in the QMC go to System, Setup, QlikView Servers, expand, click on yours, then on the right side open Folder Access;
and for the source folders: in the QMC go to System, Setup, Distribution Services, expand, click on yours, then on the right side open General, Source Folders.
Although the QMC says "Users and Groups", it only accepts individual user accounts, not groups.
Regarding the degradation with the second node, try preloading some of the documents. I understand that even though the files live in a folder on server1, server2 reaches them via a UNC path and has all the corresponding permissions. One more thing: does this happen only when that particular server joins, or regardless of which server joins? I mean: if server1 is up and server2 joins, it slows down. But what if server2 is up and server1 joins? Again, I'd log a Support case here.
Hope that helps.
EDIT: I'm assuming that the account running the services is a domain account and the same on both servers. If they are local users in the local Administrators group, that might be the problem, since the cluster would act as one server with two different sets of credentials, and you cannot add local users from one server to the other.