Because you are only serving a single document, it should be relatively simple to investigate the RAM usage.
What is the RAM usage when the document initially loads?
How quickly does the RAM ramp up? Does it constantly run at 70GB?
How are users connecting to the document? IE-Plugin, AJAX?
How often is the document reloaded? Do you have the QVS configured for only a single copy of the document?
Have you run the document analyzer on the document? It can be found at http://robwunderlich.com/downloads/
Is Collaboration turned on? Have you investigated the Collaboration objects using the power tools? http://community.qlikview.com/docs/DOC-3059
How often do you restart your server or the QlikView Server service? We do a daily server restart to eliminate any long-term memory leak buildup.
The RAM usage is primarily based on the volume of unique data in the document, but we have some charts that are using calculations that can cause 10 - 20 GB spikes in usage.
Thanks so much again bnichol.
I have details of my testing in the attachment of the other reply here. To clarify a few things you asked about: we're using QV 10 SR2 with the IE plugin, the document is reloaded once each day, and only one copy is allowed in memory. With regard to collaboration, we only use server bookmarks (so far).
As you've suggested, I think the memory leak/creep is the main issue here... we don't currently do a scheduled restart of the server (or services), so I'll explore that next.
It's unlikely to be a memory leak, but more likely just cache. My understanding is that cache is not released when a document is unloaded.
However, cache should be trimmed when you hit the working set low threshold, so you should not see paging in the scenario you've described, but rather cache trimming. You are on an older SR. I would recommend updating to the latest (SR5) and retesting your model.
Thank you for the feedback, Rob.
Indeed, we're looking at upgrading our environment. I'm curious about your opinion on 10 SR5 vs. 11 SR1. I like many of the new features in version 11, but we've held off on upgrading because we're a bit conservative. Upgrading our production environments requires a bit of a process, so we don't typically take every SR or patch of an application unless there is a compelling reason to do so.
So given we are cautious, and considering stability and maturity of 11 SR1, would you recommend we move to 11 SR1, or instead stay with 10 and bump to SR5?
A few notes here. That 10% is an average, and depends largely on how the document is developed, number of rows, length of values, data model schema, level of granularity and distinctness of data, the number of charts, concurrency, cached selections and so on.
Besides, each document stays in memory until its timeout expires, so you may find that a document nobody is using is still loaded, which makes sense. If you are preloading, that uses RAM as well. The same happens with cached selections: the more objects you have, the more RAM you will use until the working set limit is reached and the server starts to free memory to cache new queries. If you are using section access and reduction, that takes some RAM as well...
In addition, DMS authorization stores info in the .Shared and .Meta files, as well as the documents set to collaboration, notes, shared objects and bookmarks...
In this sense, concurrency may "happen" even when users are not logged in, because their copy of the document is still in memory, depending on the timeouts.
For a more accurate picture, I'd measure RAM usage at each of the following steps (you may add as many additional steps as you want to make the review more accurate):
- Make sure the preload option is set to "never" in all documents (while testing)
- For testing purposes, set all document timeouts to a very low value (2 or 3 mins)
- Reboot the computer and log on, and check
- Open QlikView Desktop, open the QVW file, and check.
- Close QlikView Desktop
- Start QlikView Services and check
- Restore preloading settings in all documents if any and check
- Have one user log on and load a document using Ajax, and check
- (Same with IE Plugin if users will use it)
- Have a second user (while the first is still connected) log in and open a document, and check
- Log off both users, let the document timeout and check
From this point on, each user should add the same amount of memory on average. Now start doing clicks: CPU will go up while computing, but RAM will go up slowly as well, since it's caching each document selection. You should see that when the timeout is reached, some RAM is freed.
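If it helps to keep the readings organized, the steps above can be logged and compared with a small script. This is only a sketch: the step names and the MB figures below are made up for illustration, and `memory_deltas` is a hypothetical helper, not anything that ships with QlikView. You would type in the values you read from Task Manager or Performance Monitor at each "check".

```python
from typing import List, Tuple


def memory_deltas(readings: List[Tuple[str, float]]) -> List[Tuple[str, float, float]]:
    """Return (step, reading_mb, change_vs_previous_step_mb) for each reading."""
    rows = []
    previous = None
    for step, mb in readings:
        # The first reading has nothing to compare against, so its delta is 0.
        delta = 0.0 if previous is None else mb - previous
        rows.append((step, mb, delta))
        previous = mb
    return rows


# Hypothetical readings in MB, for illustration only (not measured values).
readings = [
    ("services started", 8.0),
    ("first user opens document", 400.0),
    ("second user opens document", 440.0),
    ("both users logged off, after timeout", 430.0),
]

for step, mb, delta in memory_deltas(readings):
    print(f"{step:<40} {mb:>8.1f} MB  {delta:+8.1f} MB")
```

The delta column is the interesting part: the jump for the first user is the document itself plus that session, the jump for each later user should be roughly constant, and the last row shows how much is actually given back after the timeout.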
Hope that makes some sense and the tests go fine. Let us know anyway.
Thanks so much, bnichol and Miguel. Very helpful information. I wasn't aware of Rob's Document Analyzer, and I find that a very useful tool.
I think I'm seeing the very high memory utilization for QVS.exe on the server simply due to memory creep over time, no doubt attributable to a leak as the document is loaded/unloaded and many users open and close the document over a long period. I'll have to look into automating a regular restart (or at the least, stopping and starting the service).
I ran a number of tests and recorded memory usage for the Server service (QVS.exe) to confirm. I've attached results for review by others, if interested.
Details are in the attached workbook, but to summarize:
Using QV10 SR2 on a server (Windows Server 2008 R2 SP1).
Dashboard is a 70 MB document, approx. 1.5 million records in the fact table.
For each round of testing, I stopped and re-started the service to begin with a clean slate (on startup the service uses < 8 MB of RAM).
For single user testing, I ran three types of test, each with multiple iterations to record the memory after each. For multi-user testing, I ran with three users, each performing identical functions. In both single and multi-user testing, I tested with just opening the document, then closing. Then I tested by opening and viewing a handful of sheets. Finally, I stress-tested by opening the document, visiting all sheets and viewing all charts (~125 charts across 18 sheets).
As you can see from the results in the attached workbook, memory required seems to expand more or less as expected with additional users. However, memory is not released after each document close (after timeout), so over time this results in more and more additional RAM allocation. Once the working set Low Threshold is exceeded, this results in swapping the memory to disk, which of course will degrade performance.
This testing has helped me tweak our capacity planning model (at least for this particular dashboard). I've been testing on our TEST server, so I will look into scheduled restarts of the server (or services) in our PROD environment, then monitor as we move forward and see how it goes.
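For anyone wanting to turn this kind of result into a rough restart interval, here is a minimal sketch of the arithmetic. Everything in it is an assumption for illustration: the function name, the parameters, and the example numbers are hypothetical and are not taken from the attached workbook. It simply assumes a fixed amount of RAM is retained after each open/close cycle and counts how many cycles fit below the working set low threshold.

```python
import math


def cycles_until_low_threshold(
    total_ram_mb: float,
    low_threshold_pct: float,
    baseline_mb: float,
    retained_per_cycle_mb: float,
) -> int:
    """Estimate how many open/close cycles fit before the low threshold is hit.

    Assumes (simplistically) that every cycle leaves the same amount of
    memory allocated and that nothing is trimmed in between.
    """
    if retained_per_cycle_mb <= 0:
        raise ValueError("retained_per_cycle_mb must be positive")
    limit_mb = total_ram_mb * low_threshold_pct / 100.0
    headroom_mb = limit_mb - baseline_mb
    if headroom_mb <= 0:
        return 0  # already at or above the threshold
    return math.floor(headroom_mb / retained_per_cycle_mb)


# Hypothetical inputs: 16 GB server, 70% low threshold,
# 500 MB in use after the first load, 50 MB retained per cycle.
print(cycles_until_low_threshold(16384, 70, 500, 50))
```

Dividing the result by the number of open/close cycles you expect per day gives a very rough idea of how often a restart would be needed before paging starts. If the cache is trimmed as Rob describes, the real figure should be better than this estimate.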
RAM Usage Analysis.xlsx 15.1 K