stevejoyce
Specialist II

Engine Performance Issue - Hard Max Limit

I've read through a bunch of articles but would appreciate if some people could give insight.

There is a clear performance issue currently (not always) on one of my engine nodes.  Right now it is seemingly at the "Max memory usage" threshold (90%).  This is likely a multi-part question, and I'm not really looking for the answer to be "add more RAM".  If someone can give insights or comments on the below...

1st)  As I understand it, when I hit this threshold, I should expect performance issues?  As the system is now reading/writing page files.

2nd)  Is reaching the Max memory usage threshold avoidable?  Doesn't Qlik keep building up its cache/physical memory until this point?  So what else would prevent it from reaching this usage, other than periodically restarting services?

 

Referenced articles:

https://community.qlik.com/t5/Knowledge/Qlik-Engine-Memory-Management/ta-p/1710559

https://community.qlik.com/t5/Knowledge/Qlik-Sense-Engine-How-the-memory-hard-max-limit-works/ta-p/1...

 


8 Replies
rwunderlich
Partner Ambassador/MVP

If you are regularly hitting the "Max memory" threshold, you are almost guaranteed to experience noticeably degraded performance.  

(BTW, you posted this in a QlikView forum, but based on your mentioning engine node and "max memory" I'm assuming you are on Qlik Sense.  QS and QV are pretty much the same, but the nomenclature is different.)

There are "required" and "optional" memory objects in the Qlik Engine (QIX).

Required are those things that must be in memory. That would be applications currently in use and session data for active users. 

Optional items would be cached results, session data for idle users and applications with no current users. 

QIX tries to keep its memory allocation at or below the Min memory setting.  If the Min is exceeded QIX starts to unload optional items -- unused documents, oldest cached results.  It does this lazily 'cause what's the rush.

If the demand for required items exceeds the Min threshold, QIX must honor those requests (they are required after all) and you will start to see the usage climb beyond the Min.  QIX will start to aggressively purge older cache and timed out user sessions.

When you reach the Max threshold you are in trouble. That means all the available RAM is filled by required items.  There may be little or no caching.  QIX may start timing out and removing user sessions early.  And worst of all, if you exceed the available RAM you will start paging and performance will be very poor.  QIX does not work well with paging, data has to be in RAM. 

There is another class of required memory allocation, that is the transient allocations made during chart calculations. Those can be rather large depending on the app but memory is released as soon as the calc completes. 

Using up to the Min and occasionally peaking above it is a good thing.  But being regularly above the Min results in QIX burning a lot of CPU cycles performing memory management tasks. 

Being at the max means QIX is in a bit of emergency mode trying to stay alive and reasonably performant. 
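
To make the three zones concrete, here is a minimal Python sketch. It only illustrates the behavior described above and is not how QIX is implemented; the function name and zone labels are made up for this example, and the 70%/90% defaults are simply the values mentioned in this thread.

```python
def memory_zone(used_gb: float, total_gb: float,
                min_pct: float = 70.0, max_pct: float = 90.0) -> str:
    """Classify current usage against the Min and Max thresholds.

    Hypothetical helper for illustration only, not a Qlik API.
    """
    min_gb = total_gb * min_pct / 100
    max_gb = total_gb * max_pct / 100
    if used_gb <= min_gb:
        return "at_or_below_min"  # nothing needs to be purged
    if used_gb < max_gb:
        return "between"          # optional items (cache, idle sessions) get purged
    return "at_max"               # emergency mode, paging is the next risk


print(memory_zone(60, 100))
print(memory_zone(80, 100))
print(memory_zone(95, 100))
```

Usage at or below the Min is the comfortable state; sitting in "between" for long periods is where the extra CPU cost of memory management shows up.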

I like the garage analogy.  When your garage is 70% or less full, why bother spending time sorting through stuff? Just keep it all (even those ancient cache results) and go see a movie instead.  When it gets above 70% you start investing some time in sorting but it's no rush. You'll get to it when you have the occasional free time.  When your garage creeps towards 90% full it becomes imperative you deal with it now. You will not be going to the movie even though the rest of the family is going.  Because if it gets full you will not be able to close the door and your stuff will get rained on.  Your spouse (users) will be very unhappy.

I typically adjust the default 70% and 90% values on larger servers to reflect the absolute numbers I'm after.  You have to consider the installed RAM and what other software is on the box. If it's a dedicated Qlik server you probably only need to reserve about 4-8 GB for Windows. On a 256GB server that means your Max% can be like 97%.  Again for a 256GB example, setting a Min of 70% means you are allocating a 46GB buffer zone between the Min and the Max.  In this case I would typically raise it to a value that creates about a 15GB buffer.
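
The conversion from absolute numbers back to percentages can be sketched in a few lines of Python. This is only arithmetic on the figures above (256GB server, Max of 97%, a target buffer of about 15GB); the function name is made up for the example and nothing here reads or writes actual engine settings.

```python
def min_pct_for_buffer(total_gb: float, max_pct: float, buffer_gb: float) -> float:
    """Return the Min % that leaves buffer_gb between the Min and the Max.

    Hypothetical helper for illustration only, not a Qlik API.
    """
    max_gb = total_gb * max_pct / 100
    return (max_gb - buffer_gb) / total_gb * 100


total_gb = 256   # installed RAM from the example above
max_pct = 97     # Max % from the example above

reserved_gb = total_gb * (100 - max_pct) / 100
print(f"Left above the Max for Windows: {reserved_gb:.1f} GB")
print(f"Min % for a 15 GB buffer: {min_pct_for_buffer(total_gb, max_pct, 15):.1f}")
```

With these inputs the Max leaves a bit under 8 GB for Windows, and a 15GB buffer works out to a Min of roughly 91%.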

BTW, if you are doing reloads on this same box you don't want to push the max up to 97%.  You need to leave some room for reload tasks. 

I know where to find the right log messages and tuning knobs in QV, as that is where I've done most of my server work.  Perhaps those more familiar with QS can chime in with their own experience and some tool pointers. 

Things you can do to avoid hitting your memory max:

- Reduce the memory requirement per app using a tool like QSDA Pro

- Manually allocate apps to servers to balance the memory usage. 

- If the user session memory is significant and you have many users, load balance the app across multiple servers to distribute the user session load. 

- If you have a lot of users, consider timing your users out earlier. 

- And of course, add more RAM 😉

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com

Maria_Halley
Support

@stevejoyce 

Just as a note on @rwunderlich great post.

Once the document is loaded it stays in memory until the document timeout has passed (default 8h). The document timeout does not start to count down until all sessions against it are closed.

Closing Apps is never a part of the "cleanup" triggered when the memory limits are reached.

We also cache calculations and session information.  This does not get cleared when the app gets unloaded but stays in memory until the working set limit is reached.

Then we start to clear it out, starting with the data that was used the longest ago.
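
That "oldest use goes first" cleanup is the familiar least-recently-used pattern. Below is a toy Python sketch of the idea, assuming a simple size limit; the class, the keys and the sizes are invented for illustration and say nothing about how the engine stores its cache internally.

```python
from collections import OrderedDict


class ResultCache:
    """Toy cache that drops the least recently used entries once over a limit.

    Illustration only; names and sizes are assumptions, not engine internals.
    """

    def __init__(self, limit_mb: int):
        self.limit_mb = limit_mb
        self.entries = OrderedDict()  # key -> size in MB, least recently used first

    def used_mb(self) -> int:
        return sum(self.entries.values())

    def touch(self, key: str, size_mb: int) -> None:
        """Record a use of key, then evict old entries while over the limit."""
        self.entries[key] = size_mb
        self.entries.move_to_end(key)  # mark as most recently used
        while self.used_mb() > self.limit_mb and len(self.entries) > 1:
            self.entries.popitem(last=False)  # drop the least recently used entry


cache = ResultCache(limit_mb=100)
cache.touch("chart_a", 40)
cache.touch("chart_b", 40)
cache.touch("chart_a", 40)  # chart_a becomes the most recently used
cache.touch("chart_c", 40)  # over the limit, so chart_b is dropped
print(list(cache.entries))
```

Nothing is evicted until the limit is crossed, which matches the point that cached data stays in memory until the working set limit is reached.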

 

You can set up the system to clear the cache at a set interval. But this can cause a slower response time for the users. Also, again, it does not unload any apps; it just drops the session data, calculations etc. So the gain might not be very big.

stevejoyce
Specialist II
Author

@rwunderlich @Maria_Halley Thank you both for your detailed responses, I very much appreciate it.  Couple follow up questions for @rwunderlich .

1) Yes I am referring to QlikSense

2) Let's say I add another 100gb of RAM, wouldn't this only mean it would take longer to get to lower limit which is inevitable/by design to reach.  Is the problem more isolated to what's happening between lower limit & max limit? 

3) In theory, how can an engine hit the Max threshold?...  

Say it's currently at its working limit (let's say 70% of a 100gb server = 70gb).  A bunch of requests come in: new users, opening of new large apps, heavy calculations, etc.

i) It sounds like IF new requests consume RAM FASTER than Qlik can "lazily" unload optional memory, it can hit this peak, and once it hits the max threshold, we are in trouble?  Should Qlik correct itself (unloading enough memory to get back to the lower limit eventually)?  Or is it wedged and requires a service restart?

ii) Or if all usage going on is ALL REQUIRED memory usage and is > max threshold limit.

  -> I highly doubt this is the case.  It takes ~month to build up to the lower limit.

 

Should I monitor for maybe 90% of the Max Limit?  If that gets hit I know we are on the verge of trouble, and at the next off-hours opportunity I restart the engine service.
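
A check like that could be sketched in Python as below. This is only the threshold arithmetic, assuming the memory-in-use figure comes from whatever monitoring is already in place; the function names are made up, and the 100gb / 90% values are the ones from the example above.

```python
def alert_level_gb(total_gb: float, max_pct: float = 90.0,
                   alert_fraction: float = 0.9) -> float:
    """GB of usage at which to raise an alert: a fraction of the Max limit.

    Hypothetical helper for illustration only, not a Qlik API.
    """
    return total_gb * max_pct / 100 * alert_fraction


def should_alert(used_gb: float, total_gb: float, max_pct: float = 90.0,
                 alert_fraction: float = 0.9) -> bool:
    """True once usage has reached the alert level."""
    return used_gb >= alert_level_gb(total_gb, max_pct, alert_fraction)


# 100gb server, 90% max limit, alert at 90% of that limit (about 81gb)
print(should_alert(75, 100))
print(should_alert(85, 100))
```

With a 70% working limit this alert level sits inside the zone between the two limits, so it would fire before the max itself is reached.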

 

This is what RAM usage looked like: servers restarted 2021-07-26, back to 0 (before that, stable at 80-90gb with no issues going on).

After the restart, the servers gradually build up RAM usage and sit at the low working limit.  On Sept 1 11:00am, it took a spike and never came down.  On Sept 2 11:00am, it took another spike and never came down (the huge spike to 0 was me restarting the engine service last night).  And I would bet those two 11:00am spikes are NPrinting jobs being kicked off.

[Screenshots of RAM usage: stevejoyce_0-1630667198543.png, stevejoyce_1-1630667491927.png]

 

rwunderlich
Partner Ambassador/MVP

@Maria_Halley 

"Closing Apps is never a part of the "cleanup" triggered when the memory limits are reached."

Can you clarify what you mean by "closing"?  Do you mean unloading from memory? Or closing user sessions?

-Rob

Maria_Halley
Support

@rwunderlich 

I'm sorry for being unclear; yes, I mean unloading from memory.

@stevejoyce 

1. In short, the QlikSense Engine will take up as much memory as it needs. But it doesn't need infinite memory.

It depends on the apps and the user count and system configuration.

As mentioned before, apps that have not had any sessions opened against them for 8h (default) will be unloaded. That will free up memory. But if an app gets used again before the 8 hours have passed, the timer restarts.
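
The timer rule can be pictured with a small Python sketch. It assumes only what is described in this thread (the 8h default, and a countdown that runs only while no sessions are open); the function name and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta

DOCUMENT_TIMEOUT = timedelta(hours=8)  # default mentioned in this thread


def is_unload_due(open_sessions: int, last_session_closed: datetime,
                  now: datetime) -> bool:
    """True once an app has had no open sessions for the full timeout.

    Hypothetical helper for illustration only, not a Qlik API.
    """
    if open_sessions > 0:
        return False  # the countdown does not run while sessions are open
    return now - last_session_closed >= DOCUMENT_TIMEOUT


closed = datetime(2021, 9, 1, 20, 0)  # last session closed (made-up time)
print(is_unload_due(0, closed, datetime(2021, 9, 2, 3, 0)))  # 7h idle
print(is_unload_due(0, closed, datetime(2021, 9, 2, 5, 0)))  # 9h idle
print(is_unload_due(1, closed, datetime(2021, 9, 2, 5, 0)))  # in use again
```

Each app would be evaluated on its own, so a busy app never becomes eligible even if its neighbors do.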

2.  The apps take up the most memory, and we do not unload them from memory (see above).

If you are already at the threshold and a user opens an app that is currently not in memory, Qlik Engine will still try to load it into memory.

It is very important to make sure that the apps are optimized for performance. You can actually create one small app that by itself can bring down a server.

 

But other than the spike, I think the memory usage looks normal.

You mention NPrinting might have caused the spike, and if reports were run at that time that is very likely.

stevejoyce
Specialist II
Author

For context on the visuals, the 2 servers shown have 128 gb of total RAM, set to 70%/90% (working limit, hard max) = 90gb working / 115gb hard max.

1) Sept-01, eng2 went to 100gb but never came down to the ~80-90gb state.  I can triple check, but during the flat line on Sept-01 20:00 - Sept02 5:00, there were no active sessions or apps open for 8+ hours -> Is there any other reason Qlik would not have cleared RAM to get back down to working limit?  This looks like the biggest indicator that server wasn't going to be able to handle the next day's workload.

 

2)  Sept-02 it increased another 10gb, and at that point we saw slow performance... which I agree would be expected as we had reached the hard max.  -> I'm not sure whether it would have resolved itself if I had left it until the following day.

I understand optimizing the apps and/or adding RAM, but I'm trying to understand how this doesn't only delay the same behavior instead of resolving.

To me it sounds like I need to be most concerned how the engine can handle activity between the working limit and hard max.  And I should increase the total RAM between those thresholds, whether that's lowering the working limit %, increasing total RAM on server, optimizing apps.

I understand lowering the working limit would sacrifice optional cached data, but that is secondary to not having the server crippled as it was when it hit the max hard limit.

 

Thank you, and again very much appreciate the detailed responses.

 

Maria_Halley
Support

1. QlikSense only clears the cache when the working set limit is reached; it will not unload apps. So if there are more apps in memory than the server can handle, that cannot be solved by the "clean up".

Also, if a user requests an app that is not currently in memory, the engine will try to load it into memory.

If there were no users accessing any apps for more than 8h (assuming you have the default settings), the apps should be unloaded from memory (each app has its own timer, so all apps won't be closed at the same time).
If this doesn't happen, then I recommend that you open a case on it, because that requires more investigation than what can be done here.

There is a setting to clear the whole cache at a certain interval. But this usually doesn't make that big of a difference, since the apps are what is taking most of the memory and they won't be closed.

You can test this and see how much memory is released when emptying
https://community.qlik.com/t5/Knowledge/How-to-clear-the-cache-used-by-QlikView-Server-and-QlikSense...

2. Lowering the working set limit will not help in this case. It could even make the user experience worse.

Optimizing the apps will help the overall memory consumption.

For example

If you have 5 apps that each take 10 GB in memory (not on disk), and users log in and open sessions to all apps, memory consumption will then be 50 GB (plus what was already in memory).

In the same scenario, if the apps are each 5 GB in memory, then when they are all loaded the memory consumption will be 25 GB.
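
The same arithmetic as a short Python sketch, in case you want to plug in your own app sizes. The function name and the `already_in_memory_gb` parameter are made up for the example; the in-memory sizes would have to come from your own measurements.

```python
def loaded_memory_gb(app_sizes_gb, already_in_memory_gb=0.0):
    """Total memory once every listed app is loaded (illustration only)."""
    return already_in_memory_gb + sum(app_sizes_gb)


before = loaded_memory_gb([10] * 5)  # five apps at 10 GB each
after = loaded_memory_gb([5] * 5)    # the same five apps at 5 GB each
print(before, after, before - after)
```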

 

stevejoyce
Specialist II
Author

I still come back to thinking I need more of a range between working set limit and hard max limit. 

However, it appears that on this day we had nightly publisher tasks running during business hours, later than usual because of a delay in upstream DW ETL.  Our engines are also our slave nodes.  Usually this is not a problem because we expect refreshes to complete before business starts, but there are cases where they can be delayed.  In this scenario we had heavy reloads combined with users accessing apps, and it appears this pushed the server to its 90% max threshold.  Unfortunately the engine could not correct itself after a few hours; perhaps next time I will wait a full day and see if it can clear itself back to the working set limit.

Thank you for your input and details.