36 Replies Latest reply: Jun 25, 2015 8:37 AM by Boris Tyukin

    please help build a new server - HP DL380 or DL580

    Boris Tyukin

      Hi guys,

      I am looking for a second opinion, as we are already working with our vendors and Qlik reps. We just purchased a new server and I found out that it performs terribly with QV even though it is 4 times more expensive and more powerful on paper. We are going to return this server to our vendor (HP) and get something different, hence I am asking for opinions on the ideal (fastest) HP server for QV.

      Right now my company uses a single server to run all QV services, including QVS, QDS and AccessPoint. It is very fast but we are running out of RAM and it is not expandable so we are in the process of getting a second server for QV which will only run AccessPoint and QVS.

      Our old server:

      2 socket HP DL380P GEN8

      Intel Xeon CPU E5-2690 2.90GHz

      256GB of RAM

      Our new server (which we are going to return):

      4 socket HP DL580 Gen8

      Intel Xeon CPU E7-4870 v2 2.30GHz

      1TB of RAM

      I was super excited about the DL580, and when it arrived I started running some tests. We have a fairly complex dashboard with 90+ million rows and a lot of extensive calculations, and I found out that our old server was 2-5 times faster on most of the tabs. We tuned the server following the QV technical paper; specifically, we made sure that the max. performance power profile is used, Hyper-Threading is off and NUMA is off. Turbo Boost is on.

      It helped a little bit, especially when we disabled NUMA. I am confused why, though, because according to QV, NUMA is not a problem on 11.2, but apparently it is. After we disabled NUMA, things improved by maybe 10-20%, but it was still way slower than our old server.

      We reached out to HP, and one of their senior architects explained that the DL580 has to do more work to access RAM because of the way RAM is shared between processors, and that its memory clock speed is 1333 vs. 1600 on the DL380.
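      As a rough sanity check on the memory clock numbers, here is a minimal Python sketch of theoretical peak bandwidth per memory channel. It assumes a standard 64-bit (8-byte) channel and uses the standard DDR3 speed grades 1333 and 1600 MT/s; 1600 is taken for the DL380 because that is the operating speed reported later in the thread. The function names are illustrative only, and real-world throughput on either server will be lower than these peaks.

```python
# Rough peak-bandwidth arithmetic for one DDR memory channel.
# Assumption: a standard 64-bit (8-byte) data bus per channel.
BYTES_PER_TRANSFER = 8


def peak_channel_bandwidth_gbs(mega_transfers_per_s: float) -> float:
    """Theoretical peak bandwidth of one channel in GB/s (10^9 bytes/s)."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER / 1000.0


def relative_gain(slow_mts: float, fast_mts: float) -> float:
    """Fractional bandwidth gain of the faster speed grade over the slower."""
    return fast_mts / slow_mts - 1.0


if __name__ == "__main__":
    for mts in (1333, 1600):
        print(f"DDR3-{mts}: {peak_channel_bandwidth_gbs(mts):.1f} GB/s per channel")
    print(f"1600 vs 1333: {relative_gain(1333, 1600):.0%} more peak bandwidth")
```

      On these numbers the rated clock gap is only about 20%, so on its own it would probably not account for a several-fold difference in RAM benchmarks; the cross-socket memory access the architect describes is presumably the larger factor.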

      At that point I downloaded and ran 4 different benchmarking tools (MaxxMEM2, NovaBench, PassMark and SiSoft Sandra) and all of them showed a 2-6 times difference in RAM tests - the DL580 was slower again!

      Our goals for a new server:

      1. Should be faster than the old one
      2. Should be expandable to at least 1TB of RAM
      3. Should support at least 50 concurrent users
      4. Should be made by HP - a rack server from the HP ProLiant family
      5. Our budget is 50k
      6. The server will run ONLY QVS and AccessPoint

      I apologize for a long introduction to my question but now I am puzzled what HP server we should pick since DL580 with 4 processors clearly does not meet our needs.

      I am not a hardware expert and was relying on our hardware people and Qlik but apparently the configuration they picked did not meet our goals. They are working again to revise the config but I wanted to get a second opinion from a forum and you.

       

      I am now thinking of getting either the fastest E5 or E7, and this time only two sockets to minimize memory hops. I also want to see if we can use higher-clocked RAM (1866?)

      It is going to be a rack server from HP Proliant family – you can actually build it online here

      http://www8.hp.com/us/en/products/proliant-servers/index.html?facet=ProLiant-DL-Rack

       

      Any suggestions are highly appreciated!

        • Re: please help build a new server - HP DL380 or DL580
          Sagar Mehta

          Hi Borys,

           

          We are using the same kind of servers and are facing the same issue with an HP ProLiant DL580 Gen7. One thing that improved the server response was reducing the number of CPUs used by the QlikView server from 64 to just 32.

          The performance has improved a bit. You can try that on your system and share your feedback.


          Thanks,

          Sagar

          • Re: please help build a new server - HP DL380 or DL580
            Sagar Mehta

            We were also able to improve QV Publisher performance by updating the BIOS on our Windows Server 2008 machine. Check out the details below; hopefully they help as well.

             

            What        --> Upgrade system BIOS

            Faulty BIOS --> P65 07/01/2013

            Fix         --> Apply SP63928.exe to upgrade to P65 10/01/2013

            More info   --> HP Support Center (http://h20566.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/swdDetails/?javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.vignette.cachetoken&javax.portlet.prp_bd9b6997fbc7fc515f4cf4626f5c8d01=wsrp-navigationalState%3Didx%253D2%257CswItem%253DMTX_63d3fca65f874a63a0adebc3bc%257CswEnvOID%253D54%257CitemLocale%253D%257CswLang%253D%257Cmode%253D4%257Caction%253DdriverDocument&javax.portlet.tpst=bd9b6997fbc7fc515f4cf4626f5c8d01&sp4ts.oid=4142792&ac.admitted=1406015906171.876444892.199480143)

            • Re: please help build a new server - HP DL380 or DL580
              Boris Tyukin

              I am glad we are not the only ones facing this issue - thanks for all your replies and I feel better now

               

              I already reviewed and applied the settings recommended in the Scalability Lab document. Disabling NUMA helped a little bit; I noticed maybe a 10-20% improvement. As for other settings like disabling hyperthreading and disabling prefetch, I do not think they actually did anything.

               

              The BIOS is fresh - our server is literally straight out of factory. Even CPUs are from the most recent batch so everything is fresh

               

              I did play with CPUAffinity for AccessPoint, disabled the last half of the CPU cores and rebooted after that - it did not help at all. Our HP architect (a super knowledgeable guy who works for HP) recommended removing 2 CPUs, but to me that does not make sense because each processor cost us around 6k.
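              For readers who want to reason about "use only half of the CPUs", the idea can be expressed as a plain bitmask over logical CPU indices. The Python sketch below is generic arithmetic only: the function names are hypothetical, and it does not show how QlikView stores or applies its affinity setting, which the thread does not describe. Which physical sockets the first half of the indices map to depends on how the operating system enumerates CPUs.

```python
def first_half_affinity_mask(logical_cpus: int) -> int:
    """Bitmask with the lower half of the logical CPUs enabled (bit i = CPU i)."""
    if logical_cpus < 2:
        raise ValueError("need at least 2 logical CPUs to disable half of them")
    return (1 << (logical_cpus // 2)) - 1


def enabled_cpus(mask: int) -> list[int]:
    """List the CPU indices whose bit is set in the mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


if __name__ == "__main__":
    mask = first_half_affinity_mask(64)
    print(f"mask = {mask:#x}, enabled CPUs = {len(enabled_cpus(mask))}")
```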

               

              I think we are moving towards exchanging the server for a 2-socket one and probably getting the fastest Xeon E7 v2. The problem there, though, is that most 2-socket configurations can only access 512GB of RAM, or 768GB at most (but with slower DIMMs).

               

              But all in all it is a big disappointment for us so far - we spent 4 times more money on this server than our old one cost us, and it is literally 4 times slower than the old one, which is ironic but sad.

               

              I wonder now if something can be improved in QV engine itself by Qlik - 4 and 8 socket machines are pretty cheap these days and we will see soon even more powerful servers so I believe it is something that Qlik needs to look at and address in future.

                • Re: please help build a new server - HP DL380 or DL580
                  Egor Kobylkin

                  Borys, do you mind posting your data schema here? Maybe a few expressions too.

                  There are many things that can be done to improve the performance of a QlikView app. Maybe the collective mind will be able to suggest a few things that might help in your case.

                  90 million rows is minuscule :-) For instance I have over 7 billion rows (but not that many users).

                    • Re: please help build a new server - HP DL380 or DL580
                      Boris Tyukin

                      Hi KlickiBunti, thanks for your offer, but we have many dashboards and all of them run much slower. The one I was testing with 90MM rows is the most complicated though, and it took 2 years to develop. It has a lot of financial calculations and metrics with a health care twist, so I cannot possibly explain all of it. Please be assured though that most of the best practices are already in place.

                       

                      The problem is not the number of rows - I have a 1 billion row app which is blazing fast, but the complexity of its calculations is really low. If all you do is straight sums, it is one thing; if you have to do nested aggrs and dynamic set expressions, it is totally different. We already spent a good deal of time optimizing it, but it is super complex and our users require a raw level of detail, which does not help of course.

                        • Re: please help build a new server - HP DL380 or DL580
                          Egor Kobylkin

                          A few thoughts (and I may have made some implicit assumptions - so correct me where I am wrong) :

                          The 4-socket architecture should only be slower if it has to access the memory in an inefficient manner.

                           

                          So I wonder why the memory architecture would play a role if your dataset is rather small row-wise. It should all fit nicely into one NUMA node (256GB), shouldn't it? You want to have 1TB of RAM - how come 90 million rows amount to such a big dataset?

                           

                          If these are rather the user caches that are not managed correctly - how about testing that? How about creating 10 documents with the same content, but partitioning your user base to access them in cohorts.

                           

                          Now if 90 million rows are big because they contain many columns, maybe you could split the data into a star schema with multiple tables, so that the columns used for calculation are separated from text fields etc. The calculations would then not have to deal with those columns, reducing inefficient data transfer.

                           

                          Another thing you might want to check is whether you could trade data size for calculation complexity. Maybe you could pre-calculate a few things or create specialized tables. In other words, how about denormalizing your dataset to speed up calculations? If you only have 90 million rows, extend it into a dataset that has 9 billion rows but has more or less everything you need to calculate your stuff in a more straightforward way.

                           

                          This might be totally irrelevant to your situation but I just want to throw that at you. Just for the sake of hearing your reaction if nothing else.

                            • Re: please help build a new server - HP DL380 or DL580
                              Boris Tyukin

                              Sorry if I was not clear, but we have more than one app, and the app I am testing does not take 1TB of RAM. That app takes about 4-5GB of space compressed and 20GB uncompressed in RAM. I also tested a few other apps - they were all very slow.

                               

                              I understand about app optimization but at this point I am just trying to make the new server work at least with the same performance. That app I am testing is very complex and we already spent a lot of time optimizing it using different techniques. I already preload everything I can but a lot of metrics are not additive and cannot be calculated in a script unfortunately.

                      • Re: please help build a new server - HP DL380 or DL580
                        Sagar Mehta

                        Hi Scalability Team,

                         

                        We have applied the tweaks mentioned in the posts above. We also raised support request 00330211 - QlikView compatibility issue with HP DL580 Server. The support team says they cannot help with this and that the Scalability team could help resolve the performance issue with HP DL580 servers.

                         

                        Could somebody suggest what the next steps should be, or tell us whether the server is simply incompatible with QlikView?

                         

                        Thanks,

                        Sagar

                          • Re: please help build a new server - HP DL380 or DL580
                            Boris Tyukin

                            Hi Sagar,

                             

                            I worked with QV on this and went through their benchmarks. They analyzed the results, came back with a bunch of charts and confirmed that our DL580 performs similarly to other DL580s they tested.

                             

                            They told us that the DL380 is a much faster server but not as capable if you have hundreds of _concurrent_ users, and this is where the DL580 will shine. They told us the DL380 is like a Ferrari and the DL580 is like a school bus - the first can only carry two people but is very fast, while the school bus can carry 100 people but is slow.

                             

                            I really wish we knew this before we've made the purchase especially because Qlik was involved in this deal and architect from Qlik suggested this server to us since we asked specifically for the best hardware configuration. They also knew the complexity of our app and our business case.

                             

                            So now Qlik suggests getting a DL380, but the money was spent and it looks like we are stuck with a super powerful, 4 times more expensive server which is 4 times slower than the DL380.

                             

                            Another interesting thing: if your app does not have complex calculations like ours, the difference would not be so huge, but based on these tests it is still unlikely that the 580 can outperform the 380 looking strictly at performance measures.

                             

                            I am going to mark my answer as correct since the above was confirmed by Qlik and I really hope it will help someone not to make the same mistake.

                          • Re: please help build a new server - HP DL380 or DL580
                            Boris Tyukin

                            I thought I would post an update... HP was kind enough to take the DL580 back, and we will be getting the newly released HP DL380 Gen9, which has more RAM bandwidth and faster CPUs (v3 line). I will update you guys again once we get it and test it, but so far the specs for Gen9 look promising - especially the new HP SmartMemory, which supposedly runs at 2133 MHz (but at that speed it looks like RAM is maxed out at 512GB).

                              • Re: please help build a new server - HP DL380 or DL580
                                JULIAN Rodriguez

                                Hello Borys,

                                 

                                I have read your post and we have a similar problem with a cluster of 5 HP DL560 Gen8 servers. We noticed that these servers are slower than another DL360 Gen7.

                                 

                                How did your change go? Did performance improve with the HP DL380 Gen9? We are seriously considering this option, but it would be great to hear about your experience before making a decision.

                                 

                                Thanks in advance...

                                  • Re: please help build a new server - HP DL380 or DL580
                                    Boris Tyukin

                                    Hi Julian,

                                     

                                    we just got the new server and finished testing it. We bought HP DL380 gen 9 - we were waiting for gen 9 to come out because of DDR4 and newer E5.

                                     

                                    NEW:

                                    2 CPU 18 Core E5-2699 v3 2.3GHz

                                    768GB DDR4 LRDIMM 2133MHz

                                    (operates at 1600Mhz)

                                     

                                    Out of the box, the new server was 5-10% slower than our old one - here is the spec for it

                                    OLD:

                                    2 CPU 8 Core E5-2690 2.90GHz

                                    256GB DDR3 DRAM 1866MHz

                                    (operates at 1600Mhz)

                                     

                                    After some tweaks and benchmarks, I settled down on this configuration in BIOS:

                                    hyperthreading is off, Enable HW prefetch, Enable NUMA, virtualization is off, Power set to High Performance, Turbo Boost is on

                                     

                                    I used our most complex and resource-intensive dashboard, which uses set analysis, aggr functions and such.

                                     

                                    After these tweaks, the new server performs about the same as the old one. I have to say that the new server cost us 2.5 times more than the old one.

                                     

                                    I also downloaded and ran some benchmarking tools - GeekBench, Passmark and SiSoftware Sandra. While they are not designed to test servers, I could get the idea of the performance relative to our old server.

                                     

                                    Even though CPU clock speed on our new server is 2.3Ghz, it beat our old server in all CPU benchmarks. RAM unfortunately was much slower in tests - I am attaching my results if you are curious to look at this.

                                     

                                    We also tried pulling out 256GB worth of DIMMs so the motherboard runs the memory at 2133MHz, but surprisingly it did not make any difference for QlikView - it did make some difference in software benchmarks though.

                                     

                                    To sum up, while we now have more RAM (768 vs 256) and 20 more cores (36 vs 16), I do not think it was worth the price difference.

                                     

                                    HP people are telling us now that QlikView can go through certification process with HP DL380 server - part of this process would be to make sure that QlikView engine uses all the latest innovations and advanced command sets supported now by E5. I will let our QlikView rep know that but unless Qlik is willing to invest time and money in this process, I do not have much hope here.

                                      • Re: please help build a new server - HP DL380 or DL580
                                        JULIAN Rodriguez

                                        Hello Borys

                                         

                                        First of all, thank you for your detailed answer.

                                         

                                        I'm surprised (sadly, I should say) by your experience. I was hoping that the new server would be better than the old one.

                                         

                                        I have to ask, because it is not clear to me: which of these two servers performs best, the DL580 Gen8 or the DL380 Gen9?

                                         

                                        Or is the old one definitely better than both the newer DL580 and DL380?

                                         

                                        Did the HP people tell you when they will have the certification process results?

                                         

                                        Thanks so much.

                                          • Re: please help build a new server - HP DL380 or DL580
                                            Boris Tyukin

                                            I have corrected my post as our old server had 8-core not 16 core CPUs - my bad.

                                             

                                            Based on my testing, our old DL380 Gen8 is the fastest at this point, closely followed by the DL380 Gen9. The DL580 Gen8 was way too slow for us, which can be explained by the QPI hops between 4 processors, so I would not go with that one.

                                             

                                            We just had a call with HP architects and it looks like they are willing to loan us a couple of other processors to test. They say the slowdown could be due to the high number of cores we have on Gen9, and that if we downgrade core-wise, RAM access should be much faster. But it is speculation that needs to be verified - if we ever get other processors to test, I will post an update.

                                             

                                            As far as certification goes, it is something that Qlik should initiate and work on with HP - nothing you or I can do here.

                                              • Re: please help build a new server - HP DL380 or DL580
                                                Boris Tyukin

                                                Our final configuration


                                                HP DL380 Gen9

                                                2 CPU 8 Core E5-2667 v3 3.2 GHz

                                                512GB of DDR4 LRDIMM 2133MHz

                                                (you could go up to 768GB but that will force it to run at 1600MHz)

                                                 

                                                BIOS settings:

                                                     hyperthreading is off, Enable HW prefetch, Enable NUMA, virtualization is off, Power set to High Performance, Turbo Boost is on


                                                Read below for more details.

                                                 

                                                 

                                                We just got the E5-2667 v3 CPUs (8 cores, 3.2 GHz) and I just finished testing them. We had also purchased 768GB of DDR4 RAM, but at that amount it can only run at 1600MHz, and I wanted to test it at its native clock speed of 2133MHz.

                                                 

                                                Boy, that made a difference! While I saw a 15-20% improvement in performance with this new CPU and 768GB of RAM, once we removed some DIMMs to get down to 512GB (so RAM could run at 2133MHz), I saw a 30-35% performance improvement compared to our old server, and for some sheets as high as 60-65%! The latter was on one of our most popular Summary dashboard pages, which has over 75 metrics calculated at once. On our old server it would take 15-20 seconds to calculate that page and now it takes 5-10 seconds - quite an improvement.
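                                                To put the page timings above into comparable terms, here is a small Python sketch that converts a before/after calculation time into a percentage time reduction and a speedup factor. The function names are hypothetical, and pairing the midpoints of the quoted ranges (17.5 s before, 7.5 s after) is an assumption made only for illustration; the thread does not say which individual runs correspond to each other.

```python
def time_reduction_pct(old_seconds: float, new_seconds: float) -> float:
    """Percentage of calculation time saved relative to the old timing."""
    if old_seconds <= 0:
        raise ValueError("old_seconds must be positive")
    return (old_seconds - new_seconds) / old_seconds * 100.0


def speedup_factor(old_seconds: float, new_seconds: float) -> float:
    """How many times faster the new timing is than the old one."""
    if new_seconds <= 0:
        raise ValueError("new_seconds must be positive")
    return old_seconds / new_seconds


if __name__ == "__main__":
    old_mid, new_mid = 17.5, 7.5  # assumed midpoints of the quoted ranges
    print(f"time reduction: {time_reduction_pct(old_mid, new_mid):.0f}%")
    print(f"speedup factor: {speedup_factor(old_mid, new_mid):.2f}x")
```

                                                Reporting both figures avoids ambiguity, since "X% faster" can be read either as time saved or as a throughput gain.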

                                                 

                                                So my takeaway from this long journey is this:

                                                 

                                                0) Hardware upgrade should be the last resort - use best practices when you build your apps! Spend a good amount of time on your data model. If it looks like a spider web or spaghetti, redo it. Same with expressions - a bad expression can crash your server no matter how powerful it is. In our case, we had a very decent data model and an aggregated version of our most popular dashboard, but it was not enough, and our user base was growing rapidly, so the upgrade was justified.

                                                 

                                                1) Do not trust the hardware specs in the case of QlikView - more expensive / faster hardware does not mean it will make your QlikView dashboards faster. The 4 CPU DL580 server was 4 times slower for us and 3 times more expensive! Luckily we were able to return it to the vendor. Pick your most used dashboard with a lot of calculations and test, test, test.

                                                 

                                                2) The more RAM you install, the slower it may run (at 768GB our DIMMs were forced down to a lower clock) - get high performance RAM (DDR4) and make sure your RAM operates at the highest clock speed it can support.

                                                 

                                                3) Higher clocked CPUs matter! If you want speed, get one with fewer cores but a higher clock. If you want capacity to handle more users, get one with more cores, but it will be slower.

                                                 

                                                4) Also keep in mind that CPUs with a large number of cores (above 8) will have some overhead when accessing RAM, so they will be slower (but will handle more users concurrently).

                                                 

                                                5) play/test BIOS settings, I ended up with the following:

                                                     hyperthreading is off, Enable HW prefetch, Enable NUMA, virtualization is off, Power set to High Performance, Turbo Boost is on

                                                 

                                                6) 4 CPU beasts are NOT a good choice - they are slow with QlikView. Go with 2 CPU servers.

                                                 

                                                Hope it will help someone! Took us 9 months and I am finally happy with our choice. I always knew that hardware should be picked considering the software it will run and in QlikView case it was very evident!

                                                  • Re: please help build a new server - HP DL380 or DL580
                                                    JULIAN Rodriguez

                                                    Hello Boris,

                                                     

                                                    I'm glad to read that you have solved your performance issue.

                                                     

                                                    We are tuning the QlikView applications, splitting one big model into 5 models (one for each business area) and doing Publisher reductions by region or division.

                                                     

                                                    In the end, we have improved the performance on the same hardware.

                                                     

                                                    So I would add to your summary that it's important to analyze whether the QlikView application is designed as well as possible, or whether you can improve it.

                                                     

                                                    Thanks for sharing your experience; I'll take note of your advice.

                                                     

                                                    Best regards

                                                     

                                                    Julian

                                                      • Re: please help build a new server - HP DL380 or DL580
                                                        Boris Tyukin

                                                        Thanks, Julian. Yes, I forgot to mention that we exhausted our options for improving the model - I even built an aggregated version of our dashboard and used the document chaining feature. Another problem: we have 25 dashboards now and 20 developers, so it was getting crowded on our old server. I will add a note to my post about data model / calculations though - personally I normally spend 60% of my time on the data model, which always pays off when you deal with volumes above 10MM.

                                                      • Re: please help build a new server - HP DL380 or DL580
                                                        Mohmed Darsot

                                                        What are your thoughts on the CPUs below in comparison to the one you have chosen? Have you had a chance to test any of them? What do you see as the pros and cons?

                                                         

                                                        E5-2667 v3 compared to

                                                         

                                                        E5-2697 v3  - 14 cores

                                                        E5-2690 v3 - 12 cores

                                                        E5-2687 v3 - 10 cores

                                                         

                                                        http://ark.intel.com/compare/83361,81909,81713,81059

                                                         

                                                        • Does the number of cores reduce performance?
                                                        • Is it better to have more cores if you want to balance performance and concurrent users?
                                                        • Does base speed matter, or do you look at max turbo speed?
                                                        • Does it matter if some of these CPUs are not on the Qlik Scalability Center's tested list? Is it better to choose from their list?
                                                        • Can 512GB of RAM be distributed with these CPUs in an optimal fashion to provide max performance?
                                                          • Re: please help build a new server - HP DL380 or DL580
                                                            Boris Tyukin

                                                            If you want a nice balance between speed (performance) and capacity (number of concurrent users), go with a higher-clocked CPU (base clock, not turbo) with fewer cores. I am pretty happy with the E5-2667 v3, and you can read my story above for all the troubles we had with more expensive but slower servers.

                                                             

                                                            We did work with a Qlik architect from the Scalability Center; he was aware of our situation and recommended the 4P DL580, which turned out to be a really bad choice for us. So be VERY cautious about what they tell you - they are good guys, but at the end of the day it is you and your company that matter and only you know what's good for you, so do your own research, especially if you are upgrading an existing server. In our case the new 4P machine turned out to be 4 times slower!

                                                             

                                                            I also spent a number of hours talking with a hardware architect from HP - a super knowledgeable guy. He told me right away that 4P machine is not a good choice if you talk speed no matter what and 2P is the only good choice.

                                                             

                                                            He said that the more cores you get, the more work the machine does to handle RAM-intensive apps, so it becomes slower, and there is also a direct trade-off between the number of cores and clock speed - the more cores, the lower the clock speed. But at the same time the server can handle more concurrent requests.

                                                             

                                                            So ultimately it is your choice. If you are not constrained by $$, I would go with a faster 2P machine (meaning fewer cores and a higher CPU clock), and in the future you can always get another one and set up a cluster if you ever grow that far.

                                                            • Re: please help build a new server - HP DL380 or DL580
                                                              Boris Tyukin

                                                              As far as RAM configuration goes, you want the fastest clock speed you can get, which means a smaller total amount. You can get an idea by using the "hp memory configurator" - just google it. Go with less RAM but the fastest configuration.
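To put rough numbers on why DIMM speed matters, here is a minimal sketch of theoretical peak memory bandwidth per socket. It assumes a standard 64-bit (8-byte) DDR channel and 4 memory channels per socket. The 1600 and 2133 MT/s speeds are illustrative inputs only, not measurements from our servers, and the function name is made up for this example.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s: float,
                        channels: int = 4,
                        bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s for one CPU socket.

    Assumptions (not from the thread): 8 bytes per transfer per channel
    and 4 populated channels per socket.
    """
    return transfer_rate_mt_s * bus_bytes * channels / 1000.0


# Illustrative DIMM speeds; plug in whatever your memory configurator reports.
for rate in (1600, 2133):
    print(f"{rate} MT/s -> {peak_bandwidth_gb_s(rate):.1f} GB/s per socket")
```

This is only an upper bound. Real throughput depends on how the DIMMs are populated and on the workload, so treat it as a way to compare configurations, not as a prediction.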

                                                                • Re: please help build a new server - HP DL380 or DL580
                                                                  Mohmed Darsot

                                                                  Thanks Boris

                                                                   

                                                                  How big is your dashboard (size in MB or GB), how many dashboards are running at the same time, and how many concurrent users have you tested on your E5-2667 machine?

                                                                   

                                                                  You had 512 GB configured at the highest speed. Qlik suggests that 384 GB is optimal. Do you think 512 GB is the optimal configuration in your case?

                                                                    • Re: please help build a new server - HP DL380 or DL580
                                                                      Boris Tyukin

                                                                      Sorry, I cannot disclose exact numbers about users and dashboards, but we have over 50 apps and 10-15 concurrent users. Most apps are pretty small, but the most popular one is huge in QlikView terms: 150 million rows, 15 GB uncompressed, with a lot of complex calculations and a nicely optimized star schema. I feel like we are pushing it with QlikView.

                                                                       

                                                                      But remember, every case is different - it is all about your data model and the complexity of your calculations and objects, so you have to test your own.

                                                                       

                                                                      BTW, if you work with CDW / HP, they will be happy to loan you 2 or 3 CPUs for 30 days, and you can pick the one that works best in your DL380.

                                                                       

                                                                      As for RAM, I am not really an expert, but I noticed that when HP and Dell publish their performance benchmarks, they normally do not put a lot of RAM in the servers, which might give you a hint. Personally, I would go with the maximum amount of the fastest RAM though - QlikView is very hungry for RAM.

                                                                • Re: please help build a new server - HP DL380 or DL580
                                                                  Mohmed Darsot

                                                                  Hi Boris

                                                                   

                                                                  Just wondering whether you considered simply upgrading your CPU from the E5-2690 v2 to the v3, and whether you would have seen the same performance.

                                                                   

                                                                  The 2690 has 10 cores and the 2667 you chose has 8. Do you see any difference in handling the number of concurrent users?

                                                                   

                                                                  We are debating which of these 2 CPUs to choose.

                                                                    • Re: please help build a new server - HP DL380 or DL580
                                                                      Boris Tyukin

                                                                      It was not really an option for us, as we needed more RAM and were also going to split Publisher from QVWS.

                                                                      I do not think going from v2 to v3 would give a significant boost in performance, but going from Gen8 to Gen9 with DDR4 and faster processing is worth it if your budget allows.


                                                                      As for the number of cores, it is a compromise, as the HP architect explained to me. The more cores you have, the more users and calculations your server can handle, but the lower the clock frequency (and per-calculation performance) will be. The top-performing CPUs (in terms of speed) have fewer cores and a higher frequency. I felt that an 8-core 2P configuration was a sweet spot for us, but your case might be different.
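The trade-off can be sketched with a crude back-of-the-envelope comparison. Everything below is an assumption for illustration: the core counts and base clocks are example figures (check the vendor spec sheets for real ones), and "cores x GHz" is only a rough proxy for capacity that ignores memory speed, turbo and architectural differences.

```python
from dataclasses import dataclass


@dataclass
class CpuOption:
    """Hypothetical CPU choice; the values are illustrative, not vendor data."""
    name: str
    cores: int
    base_ghz: float

    def per_core_speed(self) -> float:
        # Proxy for how fast a single calculation runs.
        return self.base_ghz

    def aggregate_throughput(self, sockets: int = 2) -> float:
        # Proxy for how much concurrent work the whole server can absorb.
        return self.cores * self.base_ghz * sockets


options = [
    CpuOption("8-core option", cores=8, base_ghz=3.2),
    CpuOption("10-core option", cores=10, base_ghz=3.0),
]

fastest_single = max(options, key=lambda c: c.per_core_speed())
most_capacity = max(options, key=lambda c: c.aggregate_throughput())

print("Best per-core speed:", fastest_single.name)
print("Best aggregate capacity:", most_capacity.name)
```

With few concurrent users and heavy calculations, the per-core figure is probably the one to watch. With many users, the aggregate figure matters more. Either way, testing with your own apps beats any proxy like this.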


                                                                      Consider getting a trial CPU from HP for 30 days like we did - if you are not happy with it, you can return it.

                                                        • Re: please help build a new server - HP DL380 or DL580
                                                          Mohmed Darsot

                                                          Hi Boris

                                                           

                                                          Quick Question:

                                                           

                                                          512Gb of DDR4 LRDRAM 2333Mhz

                                                          (you could go up to 768Gb but it will force it to work at 1600Mhz)

                                                           

                                                          Since you are using 2 CPUs, does 512 GB refer to the maximum memory for 1 CPU or for both CPUs combined?

                                                           

                                                          Can you point me to a link where it says you cannot go beyond 512 GB if you want to maintain the highest speed?