We have a problem with our performance.
We have a data model with more than 600 million rows, and our server (8 × 2.39 GHz CPU cores and 28 GB RAM) doesn't really like the reloads.
The CPU and RAM are at 100% when we reload.
We are thinking of a few options, but I also have some questions about those:
Thanks in advance,
When it comes to huge volumes of data, Qlik supports multiple options, and choosing a specific one depends on your actual scenario. You can go through the attached link, which details the available options and also gives an idea of when to choose a particular one.
I have personally used a similar version of On-Demand App Generation when we had to deal with billions of rows of data.
Thanks & Regards,
In general, if you split the app into smaller apps, and assuming you won't reload all of them at the same time, you'll get better performance for end users and better reload times for the apps.
As for filtering across apps: you can pass selections between apps when you call the second app from the first, but it will reopen the target app every time.
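To make the selection passing concrete: in Qlik Sense you can open a target app with selections already applied by building a link in the style of the App Integration API. This is only a sketch — the server name, app and sheet IDs, and the Customer field are all placeholders, and the exact parameter syntax should be checked against the App Integration API documentation for your version:

```
https://<server>/sense/app/<targetAppId>/sheet/<sheetId>/state/analysis?select=clearall&select=Customer,ACME
```

The first `select=clearall` clears any default selections in the target app before `select=Customer,ACME` applies the value passed from the calling app.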
Splitting into separate applications per customer will reduce the resource demand. Whether that is best achieved with on-demand app generation or handled via tasks is hard to predict. Keep in mind that the two alternatives time their demand for resources differently.
One thing to keep in mind is to differentiate between reload performance and usage performance. These are quite different operations with different types of resource demands.
Reading a stream of data is generally not a very CPU-demanding operation. More data will consume more RAM. Complex transformations will probably consume more of both.
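Since the volume read drives the RAM demand of the reload, one common way to shrink it is an incremental load: read only new or changed rows from the source and append the unchanged history from a QVD. A rough sketch in Qlik load script, where the table, field and file names (`Facts`, `FactID`, `ModifiedDate`, `Facts.qvd`) and the `vLastReloadTime` variable are all assumptions to adapt to your own model:

```
// 1. Fetch only rows changed since the last reload from the source.
Facts:
SQL SELECT * FROM Facts
WHERE ModifiedDate >= '$(vLastReloadTime)';

// 2. Append the unchanged history from the previous QVD,
//    skipping rows already fetched in step 1.
Concatenate (Facts)
LOAD * FROM [lib://Data/Facts.qvd] (qvd)
WHERE NOT Exists(FactID);

// 3. Store the combined table back for the next run.
STORE Facts INTO [lib://Data/Facts.qvd] (qvd);
```

Note that the `WHERE NOT Exists()` clause makes the QVD load non-optimized, but it is still far cheaper than re-reading 600 million rows from the source on every reload.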
Calculations try to leverage as much CPU as possible to complete the calculations as quickly as possible. RAM is used to cache the result of the calculation.
Application design can also force temporary tables to be created during calculation. These consume RAM, but the memory is returned once the calculation completes.
What fills the cup, the first or last drop?
In QlikView, reloads and calculations were performed by different processes and were therefore in direct competition for resources. In Qlik Sense, the reload is performed by the same process as the calculations, the QIX Engine, and the competition for resources is handled within the product. That doesn't mean there is no competition.
My approach is to try to assess the resource demand for each "task": how many resources are used when the server is serving this app to the users, and then isolating the reload to see how it consumes resources.
With this knowledge you can better plan your solution for how to address the main shortage.
The Scalability Center recommends using the Performance Triangle to map the situation. This covers the Environment, the Application and the Usage: there must be a balance between the supply of and demand for resources. Application design is important, and understanding the usage pattern and ensuring that the environment delivers according to its theoretical capacity is a good starting point. There are many documents on Qlik.com that describe performance and scalability.
If your published app is using a lot of CPU and RAM, the reload will impact several things. It might force the QIX Engine to purge items from the cache to make room for the data being loaded from the data source. Loading data also consumes CPU, but in general a "simple" load isn't primarily CPU-hungry.
Dividing large data sets into separate applications will improve the situation, unless there are many users who are supposed to analyse across all the data.
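As a sketch of how such a split could be scripted, the loop below cuts one large fact table into per-customer QVDs, which could then feed separate apps or an ODAG template. The lib paths, table and field names are assumptions, not a prescription:

```
// Get the list of customers present in the big fact QVD.
Customers:
LOAD DISTINCT Customer FROM [lib://Data/Facts.qvd] (qvd);

FOR i = 0 TO NoOfRows('Customers') - 1
    LET vCustomer = Peek('Customer', $(i), 'Customers');

    // Load only this customer's rows (the WHERE clause makes
    // this a non-optimized load).
    Slice:
    LOAD * FROM [lib://Data/Facts.qvd] (qvd)
    WHERE Customer = '$(vCustomer)';

    STORE Slice INTO [lib://Data/Facts.$(vCustomer).qvd] (qvd);
    DROP TABLE Slice;
NEXT i
```

Running this in a scheduled task app keeps the per-customer apps themselves small: each one only reads its own QVD with a fast, optimized load.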
Establishing a separate reload-node might also be an option. This will reduce the competition.
Application design, both reload and visualisations, should be assessed to make sure they solve the task without wasting resources.
One option is to look at the hardware. Architecture and configuration matter. 28 GB of RAM for a 600-million-row application might be too small: not enough RAM will force recalculation of previously cached results that were purged. 8 cores is not a lot, but whether that's enough is hard to tell. It is proven that more cores will complete a task quicker, as long as processing capacity is the limiting factor and the architecture is the same.
I would collect Performance Counters for a while, using the template included in the Scalability Tools. This is valuable data that can aid in diagnosing the situation.
A well-based diagnosis will make it easier to prescribe the correct medicine.