This page describes the settings for best performance for servers running the Qlik Associative Engine.
Latest update: May 2022
Windows Server 2022 has improved performance on servers with many physical cores. The following table shows the definitions used in this document.
|  | Older Windows versions | Windows Server 2022 |
| --- | --- | --- |
| Server with normal core count | ≤64 physical cores | ≤90 physical cores |
| Server with large core count | >64 physical cores | >90 physical cores |
| Setting | Value |
| --- | --- |
| Hyper-threading | Applies to QlikView and Qlik Sense servers:<br>There are use cases where enabling hyper-threading is beneficial even on servers with a very large number of cores, so it is best to test this setting with your own applications. |
| Power Management (System Profile Settings) | Applies to QlikView and Qlik Sense servers:<br>Another setting that can be used is the full-performance setting, but it makes the server run constantly at the maximum clock speed on all cores, which has the following drawbacks:<br>A solution to this is to use a custom system profile in the server BIOS that allows the CPUs to use their C-states while all other components are set to full performance. The custom system profile should be set up similar to the following: |
| NUMA | QlikView servers (Intel):<br>Qlik Sense servers (Intel):<br>*On servers with Intel CPUs, NUMA is disabled by enabling Node Interleaving (see the verification sketch below the table).<br>QlikView and Qlik Sense servers (AMD EPYC): |
| Memory configuration | QlikView and Qlik Sense servers: |
| Hardware/Software Prefetcher | QlikView and Qlik Sense servers: |
The names of the settings and how to tune them may differ depending on the server manufacturer and model. Refer to the documentation for your server to find the equivalents of the settings listed above.
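After changing the BIOS settings, it can help to confirm how Windows actually sees the memory topology. The following is a minimal sketch, not part of the original guidance, assuming a Windows server with Python 3 available; it asks the Win32 API for the highest NUMA node number, and with Node Interleaving enabled (NUMA disabled) it should report a single node.

```python
# Minimal sketch, assuming a Windows server with Python 3 available:
# ask the OS for the highest NUMA node number. With Node Interleaving
# enabled in the BIOS (NUMA disabled), this should report a single node.
import ctypes

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)

highest_node = ctypes.c_ulong(0)
if not kernel32.GetNumaHighestNodeNumber(ctypes.byref(highest_node)):
    raise ctypes.WinError(ctypes.get_last_error())

print(f"NUMA nodes visible to Windows: {highest_node.value + 1}")
```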
| Setting | Value |
| --- | --- |
| Power plan | QlikView and Qlik Sense servers: |
| Registry update | Qlik Sense servers only: For servers with a large core count, there is a registry change, applicable to both Intel and AMD CPUs, that improves responsiveness when the Qlik Sense Repository Service (QRS) is under heavy load (for example, when many users open the hub at the same time). Two registry updates are needed (see the sketch after this table):<br>Add the Thread_NormalizeSpinWait key as a DWORD value to the following subkey: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework<br>Add the Switch.System.Threading.UseNetCoreTimer key as a String value to the following subkey: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\AppContext<br>The fix is described in full here: https://support.microsoft.com/en-za/help/4527212/long-spin-wait-loops-in-net-framework-on-intel-skyl... |
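For reference, the two registry values above can also be added with a short script. This is a minimal sketch only, assuming Python 3 run as Administrator on the Qlik Sense server; the data values (DWORD 1 and the string "true") follow the linked Microsoft article and should be verified there before applying them in production.

```python
# Minimal sketch, assuming Python 3 run as Administrator on the Qlik Sense
# server. The data values (DWORD 1 and the string "true") follow the linked
# Microsoft article; verify them there before applying in production.
import winreg

with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE,
                        r"SOFTWARE\Microsoft\.NETFramework",
                        0, winreg.KEY_SET_VALUE) as key:
    winreg.SetValueEx(key, "Thread_NormalizeSpinWait", 0, winreg.REG_DWORD, 1)

with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE,
                        r"SOFTWARE\Microsoft\.NETFramework\AppContext",
                        0, winreg.KEY_SET_VALUE) as key:
    winreg.SetValueEx(key, "Switch.System.Threading.UseNetCoreTimer",
                      0, winreg.REG_SZ, "true")
```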
/ Cheers from the Scalability Center team
Both HT and NUMA disabled. NUMA was disabled on all tests I ran.
Did you see great improvement?
How did you launch 40 qvw at the same time? by QV desktop maybe?
because there's a limit of 9 concurrent?
After changing the BIOS settings, is there anything else to be done to get QV functioning again?
Thanks.
To run 40 concurrent processes you need a few things:
I ran the concurrent tasks by writing a script to generate the 40 clones of a test document, create QDS tasks for them, and schedule them to run concurrently every x minutes.
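For anyone trying to reproduce a similar load test, here is a rough sketch of the cloning step only, with placeholder paths; creating the QDS reload tasks and scheduling them to run concurrently is done separately (for example in the QlikView Management Console) and is not shown.

```python
# Rough sketch of the cloning step only; paths and names are placeholders.
# Creating the QDS reload tasks and scheduling them to run concurrently is
# done separately (e.g. in the QlikView Management Console) and is not shown.
import shutil
from pathlib import Path

TEMPLATE = Path(r"C:\QlikTest\template.qvw")   # placeholder template document
TARGET_DIR = Path(r"C:\QlikTest\clones")       # placeholder output folder
N_CLONES = 40

TARGET_DIR.mkdir(parents=True, exist_ok=True)
for i in range(1, N_CLONES + 1):
    shutil.copyfile(TEMPLATE, TARGET_DIR / f"test_{i:02d}.qvw")
print(f"Created {N_CLONES} clones in {TARGET_DIR}")
```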
It seems weird to disable multithreading and NUMA in order to improve performance. Is there a logical explanation for this?
Sure, there is a logical explanation. There is more detail in the Technical Brief "Overview on QlikView Scalability and Performance", which you can download from www.qlik.com.
QlikView is, at its very core, a CPU- and memory-intensive application, so the larger and more powerful the CPUs and the more (and faster) RAM the system has, the better QlikView will perform.
NUMA basically makes each core use its own local memory first, and only when that memory runs out does it have to ask another node for more. In older configurations with a single socket (one CPU on the motherboard), reaching another core's memory was easy because everything was wired to the same memory controller. In other words, NUMA is fine when the physical hardware is used for virtualization or file storage, because it helps map vCPUs to physical CPUs and manage resources, but it is not a good fit for a demanding application like QlikView that will use as much RAM as you give it.
Moreover, on today's 4- and 8-socket machines the cores are no longer all directly connected to each other; they are split into groups, so communication has to cross the bus from one group to another, which creates bottlenecks. In addition, jumping from one node's memory to another's costs many CPU cycles, degrading performance. In fact, tests have shown that the larger the core count, the poorer the performance can get.
Disabling NUMA makes all memory available as one uniform pool, regardless of the number of cores and the amount of RAM installed in the system, so QlikView can benefit from all of it without hopping from one node to another, avoiding those bottlenecks.
Note that not all processors allow disabling NUMA.
It is similar with hyper-threading: the hardware spends extra CPU cycles dispatching instructions to one or the other of the two threads that each core exposes. However, there are tests in the documentation I mention showing that 2-socket CPUs with hyper-threading enabled achieve better throughput.
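As a quick way to check whether hyper-threading is actually enabled on a given server, one option is a small sketch like the following, assuming the third-party psutil package is installed: it compares the physical core count with the logical processor count.

```python
# Small sketch, assuming the third-party psutil package is installed:
# compare physical cores with logical processors; if logical > physical,
# SMT / hyper-threading is enabled.
import psutil

physical = psutil.cpu_count(logical=False)
logical = psutil.cpu_count(logical=True)

print(f"Physical cores: {physical}, logical processors: {logical}")
if physical and logical and logical > physical:
    print("Hyper-threading appears to be enabled")
else:
    print("Hyper-threading appears to be disabled (or could not be detected)")
```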
Kind regards,
Miguel
Wow !!!
Many thanks for supplying that explanation. I had always been confused as to why disabling NUMA is recommended for a QlikView Server and now I know as your explanation makes perfect sense.
Many Thanks, Bill
Thanks Miguel Great explanation!!!
Great answer Miguel. There is also a technical brief for public consumption here:
Mike T
I have an HP server that has two options that are confusing me: Node Interleaving (Enable/Disable) and NUMA Grouping (Clustered/Flat). They were set to Disabled and Clustered respectively. In the QMC I could only see 1/4 of the available cores. When our server team changed NUMA Grouping to Flat I was able to see all my cores, but I'm wondering if Node Interleaving should still be changed to Enabled.
Hi Jonathan,
The two options are indeed a bit confusing. Most servers have only one option to enable/disable Node Interleaving, and when Node Interleaving is enabled, the processors are not grouped by NUMA node.
Apparently, on recent HPE servers, HPE has split this into two options.
So in your case it is recommended to set Node Interleaving to Enabled (NUMA disabled) and NUMA Grouping to Flat (so that the cores are not grouped according to their NUMA node).
By setting Node Interleaving to Enabled you make sure that QlikView spreads the data evenly over all memory available on all sockets. The software is normally able to detect NUMA, but it is still better to disable it.
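As a side note (not from the original reply), the symptom of seeing only a fraction of the cores is often related to Windows processor groups. A small sketch like the following, assuming a Windows server with Python 3, reports how the logical processors are grouped:

```python
# Sketch, assuming a Windows server with Python 3: report Windows processor
# groups. If the logical processors are split over several groups, a process
# that is not group-aware only sees the group it was started in, which can
# look like "only a fraction of the cores" in monitoring tools.
import ctypes

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
ALL_PROCESSOR_GROUPS = 0xFFFF

groups = kernel32.GetActiveProcessorGroupCount()
total = kernel32.GetActiveProcessorCount(ALL_PROCESSOR_GROUPS)

print(f"Processor groups: {groups}, total logical processors: {total}")
for g in range(groups):
    print(f"  group {g}: {kernel32.GetActiveProcessorCount(g)} logical processors")
```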
Best regards,
Frederic