4 Replies. Latest reply: Nov 8, 2016 10:50 AM by Isabelle Timmermans

    Server performance - Lots of data

    Isabelle Timmermans

      Hi,

       

      We have a problem with our performance.

      We have a data model with more than 600 million rows, and our server (8 × 2.39 GHz CPU cores and 28 GB RAM) doesn't really like the reloads.

      The CPU and RAM are at 100% when we reload.

       

      We are thinking of a few options, but I also have some questions about those:

      • We have several clients in our app, so maybe we can split it into one app per client. The total amount of data will stay the same, but it will be spread over several apps and reload tasks. Will this improve reloads and server performance?
      • Alternatively, we could split the data sources over a few apps, so that each app contains only one data source and the number of rows per app decreases. The total amount of data will again stay the same, just split over apps. We use Qlik Sense in a mash-up. Is it possible to use filters across several apps? (For example: if I select October in App1, is there a way to "transfer" the filter to App2?)
      • Are there other options to help my server with the performance?

       

      Thanks in advance,

      Kind Regards,

      Isabelle

        • Re: Server performance - Lots of data
          Raajesh Nagarajan

          Hi Isabelle,

           

          When it comes to huge volumes of data, Qlik supports multiple options, and which one to choose depends on your actual scenario. You can go through the attached link, which describes the available options and also gives an idea of when to choose a particular one.

           

          On Demand App Generation

           

          I have personally used a similar version of On Demand App Generation when we had to deal with billions of rows of data.

           

          Thanks & Regards,


          Raajesh N

          • Re: Server performance - Lots of data
            liron baram

            Hi,

            In general, if you split the app into smaller apps, and assuming you won't load all of them at the same time,

            you'll be able to get better performance for end users and better reload times for the apps.

            As for filters across apps, you can pass selections between apps when you call the second app from the first,

            but it will reopen the target app every time.
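
            For a mash-up that keeps both apps open at once, passing the selection along could be sketched roughly as below. This is only an illustration: `mirrorSelections`, `AppLike` and `FieldLike` are hypothetical names, and the sketch assumes that the target apps contain a field with the same name as the source app and that each app object exposes `field(name).selectValues(values, toggle, softLock)` in the way the Capability API app objects do.

```typescript
// Minimal structural types covering only the calls this sketch needs.
// They are assumptions for illustration, not the full mash-up API surface.
interface FieldLike {
  selectValues(values: Array<string | number>, toggle?: boolean, softLock?: boolean): unknown;
}

interface AppLike {
  field(name: string): FieldLike;
}

/**
 * Apply the same field selections to every target app.
 * `selections` maps a field name to the values chosen in the source app.
 * Returns how many field selections were sent to the targets.
 */
function mirrorSelections(
  selections: Record<string, Array<string | number>>,
  targets: AppLike[],
): number {
  let applied = 0;
  for (const app of targets) {
    for (const fieldName of Object.keys(selections)) {
      const values = selections[fieldName];
      // Nothing selected in this field: leave the target app untouched.
      if (values.length === 0) continue;
      // toggle = false replaces the current selection in that field,
      // softLock = true lets the call override locked selections.
      app.field(fieldName).selectValues(values, false, true);
      applied += 1;
    }
  }
  return applied;
}
```

            In a real mash-up the `targets` would be the app objects you already opened, and the `selections` map would be filled from whatever the user picked in the first app. Whether this is cheaper than reopening the target app depends on how many apps are held in memory at the same time.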

            • Re: Server performance - Lots of data
              Lars Skage

              Hi, Isabelle.

               

              Splitting into separate applications per customer will reduce the resource demand. Whether that is best achieved with on-demand apps or handled via reload tasks is hard to predict. Keep in mind that the two alternatives place their demand for resources at different times.

               

              One thing to keep in mind is to differentiate between reload performance and usage performance. These are quite different operations with different types of resource demands.

               

              Reading a stream of data is generally not a very CPU demanding operation. More data will consume more RAM. Complex transformations will probably consume more of both.

              Calculations try to leverage as much CPU as possible to complete the calculations as quickly as possible. RAM is used to cache the result of the calculation.

              Application design can also force temporary tables to be created during calculation. These consume RAM, but this memory is returned once the calculation completes.

              What fills the cup, the first or last drop?

               

              In QlikView, reloads and calculations were performed by different processes and were therefore in direct competition for resources. In Qlik Sense the reload is performed by the same process, the QIX Engine, as the calculations. The competition for resources is handled within the product. This doesn't mean that there is no competition.

               

              My approach is to try to assess the resource demand for each "task": how many resources are used while the server is serving this app to the users, and then, with the reload isolated, how many resources the reload itself consumes.

              With this knowledge you can better plan your solution for how to address the main shortage.
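
              As a rough illustration of that separation, the sketch below splits a series of resource samples into "during reload" and "usage only" and summarises each group. Everything in it is assumed for the example: the `CounterSample` and `ReloadWindow` shapes, the function names, and the idea that you already have timestamped CPU and RAM readings plus the start and end times of the reload tasks.

```typescript
// Hypothetical shapes for this example: one resource reading and one reload period.
interface CounterSample { time: number; cpuPct: number; ramGb: number; }
interface ReloadWindow { start: number; end: number; }
interface LoadSummary { count: number; avgCpuPct: number; peakCpuPct: number; peakRamGb: number; }

// Average and peak values for one group of samples.
function summarize(samples: CounterSample[]): LoadSummary {
  if (samples.length === 0) {
    return { count: 0, avgCpuPct: 0, peakCpuPct: 0, peakRamGb: 0 };
  }
  let cpuSum = 0;
  let peakCpu = 0;
  let peakRam = 0;
  for (const s of samples) {
    cpuSum += s.cpuPct;
    peakCpu = Math.max(peakCpu, s.cpuPct);
    peakRam = Math.max(peakRam, s.ramGb);
  }
  return {
    count: samples.length,
    avgCpuPct: cpuSum / samples.length,
    peakCpuPct: peakCpu,
    peakRamGb: peakRam,
  };
}

// Split the samples by whether they fall inside any reload window (bounds inclusive).
function splitByReload(samples: CounterSample[], reloads: ReloadWindow[]) {
  const inReload = (t: number): boolean =>
    reloads.some((w) => t >= w.start && t <= w.end);
  return {
    duringReload: summarize(samples.filter((s) => inReload(s.time))),
    usageOnly: summarize(samples.filter((s) => !inReload(s.time))),
  };
}
```

              Comparing the two summaries gives a first hint of whether the shortage comes mainly from the reload, from normal usage, or from the two overlapping.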

               

              The Scalability Center recommends using the Performance Triangle to map the situation. This includes the Environment, Application and Usage. There must be balance between the supply of and demand for resources. Application design is important, and a good starting point is to understand the usage pattern and make sure that the environment is delivering according to its theoretical capacity. There are many documents on Qlik.com that describe performance and scalability.

               

              If your published app is using a lot of CPU and RAM, a reload will affect several things. It might force the QIX Engine to purge items from the cache to make room for the data being loaded from the data source. Loading data also consumes CPU, but in general a "simple" load isn't primarily CPU-hungry.

               

              Dividing large data sets into separate applications will improve the situation, unless there are many users who are supposed to analyse across all the data.

              Establishing a separate reload-node might also be an option. This will reduce the competition.

               

              Application design, both reload and visualisations, should be assessed to make sure they solve the task without wasting resources.

              One option is to look at the hardware. Architecture and configuration matter. 28 GB for a 600M-row application might be too small. Not enough RAM will force recalculation of results that were previously cached but have since been purged. 8 cores is not a lot, and whether that's enough is hard to tell. It is proven that more cores will complete a task quicker, as long as processing capacity is the limiting factor and the architecture is the same.

              I would collect Performance Counters, using the template included in the Scalability Tools, for a while. This is valuable data that can aid in diagnosing the situation.

              A well-based diagnosis will make it easier to prescribe the correct medicine.

               

              Regards

              Lars

              • Re: Server performance - Lots of data
                Isabelle Timmermans

                Thanks for the suggestions.

                I'm gonna check them and see if they are applicable to our situation.