Anonymous
Not applicable

Please help build a new server - HP DL380 or DL580

Hi guys,

I am looking for a second opinion, as we are already working with our vendors and Qlik reps. We just purchased a new server, and I found out that it performs terribly with QV even though it is 4 times more expensive and more powerful. We are going to return this server to our vendor (HP) and get something different, hence I am asking for opinions on the ideal, fastest HP server for QV.

Right now my company uses a single server to run all QV services, including QVS, QDS and AccessPoint. It is very fast, but we are running out of RAM and it is not expandable, so we are in the process of getting a second server for QV which will run only AccessPoint and QVS.

Our old server:

2-socket HP DL380p Gen8

Intel Xeon E5-2690 @ 2.90 GHz

256 GB of RAM

Our new server (which we are going to return):

4-socket HP DL580 Gen8

Intel Xeon E7-4870 v2 @ 2.30 GHz

1 TB of RAM

I was super excited about the DL580, and when it arrived I started running some tests. We have a fairly complex dashboard with 90+ million rows and a lot of heavy calculations, and I found that our old server was 2-5 times faster on most of the tabs. We tuned the server using the QV technical paper: specifically, we made sure the maximum-performance power profile is used, hyper-threading is off, and NUMA is off. Turbo Boost is on.

It helped a little bit, especially when we disabled NUMA. I am confused why, though, because according to QV, NUMA is not a problem on 11.2, but apparently it is. So after we disabled NUMA, things improved by maybe 10-20%, but it was still way slower than our old server.
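In case it helps anyone verifying the same thing: here is a small sketch of my own (not from the QV paper) that asks Windows how many NUMA nodes it can see, using a documented Win32 call. If node interleaving is enabled in the BIOS (i.e. NUMA is "off"), Windows should report a single node:

import ctypes

# Ask the Windows kernel for the highest NUMA node number.
# With NUMA disabled in the BIOS (node interleaving on), this should be 0,
# i.e. the OS sees a single node.
highest_node = ctypes.c_ulong(0)
ok = ctypes.windll.kernel32.GetNumaHighestNodeNumber(ctypes.byref(highest_node))
if not ok:
    raise ctypes.WinError()
print("NUMA nodes visible to Windows:", highest_node.value + 1)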

We reached out to HP, and one of their senior architects explained that the DL580 has to do more work with RAM because of the way RAM is shared between processors, and because its memory clock is 1333 MHz vs. 1600 MHz on the DL380.

At that point I downloaded and ran 4 different benchmarking tools (MAXXMEM2, NovaBench, PassMark and SiSoft Sandra), and all of them showed a 2-6 times difference in the RAM tests – the DL580 was slower again!
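For anyone who wants a quick sanity check without installing any of those tools, here is a crude sketch of my own in Python/NumPy (not one of the benchmarks above, and nowhere near as rigorous) that times a large array copy to approximate sustained memory bandwidth:

import time
import numpy as np

# Two large arrays (about 800 MB each) so we exercise RAM, not the CPU caches.
N = 100_000_000
a = np.ones(N, dtype=np.float64)
b = np.empty_like(a)

# A plain copy reads N*8 bytes and writes N*8 bytes.
start = time.perf_counter()
np.copyto(b, a)
elapsed = time.perf_counter() - start

gb_moved = 2 * N * 8 / 1e9  # bytes read + bytes written
print(f"Approximate bandwidth: {gb_moved / elapsed:.1f} GB/s")

Run it a few times and take the best number; on a box with slower or more contended memory you should see it drop accordingly.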

Our goals for a new server:

  1. Should be faster than the old one
  2. Should be expandable to at least 1 TB of RAM
  3. Should support at least 50 concurrent users
  4. Should be made by HP – a rack server from the ProLiant family
  5. Our budget is 50k
  6. The server will run ONLY QVS and AccessPoint.

I apologize for the long introduction to my question, but now I am puzzled about which HP server we should pick, since the DL580 with 4 processors clearly does not meet our needs.

I am not a hardware expert and was relying on our hardware people and Qlik, but apparently the configuration they picked did not meet our goals. They are working to revise the config again, but I wanted to get a second opinion from the forum and you.

I am now thinking of getting either the fastest E5 or E7, and this time only two sockets, to minimize memory hops. I also want to see if we can use higher-clocked RAM (1866 MHz?).

It is going to be a rack server from the HP ProLiant family – you can actually build it online here:

http://www8.hp.com/us/en/products/proliant-servers/index.html?facet=ProLiant-DL-Rack

Any suggestions are highly appreciated!

36 Replies
Anonymous
Not applicable
Author

I tried that too, but things got much slower in my case. HP recommended physically disabling 2 processors in the BIOS or pulling them, but this would also disable half of the RAM we purchased.

Not applicable
Author

Borys, do you mind posting your data schema here? Maybe a few expressions too.

There are many things that can be done to improve the performance of a QlikView app. Maybe the collective mind will be able to suggest a few things that might help in your case.

90 million rows is minuscule 🙂 For instance, I have over 7 billion rows (but not that many users).

Anonymous
Not applicable
Author

Hi KlickiBunti, thanks for your offer, but we have many dashboards and all of them work much slower. The one I was testing, with 90MM rows, is the most complicated, though, and it took 2 years to develop. A lot of financial calculations and metrics with a healthcare twist, so I cannot possibly explain all of it. Please be assured, though, that most of the best practices are already in place.

The problem is not the number of rows – I have a 1-billion-row app which is blazing fast, but the complexity of its calculations is really low. If all you do is straight sums, it is one thing; if you have to do nested aggrs and dynamic set expressions, it is totally different. We have already spent a good deal of time optimizing this app, but it is super complex, and our users require a raw level of detail, which does not help, of course.

Not applicable
Author

A few thoughts (and I may have made some implicit assumptions – so correct me where I am wrong):

The 4-socket architecture should only be slower if it has to access the memory in an inefficient manner.

So I wonder why the memory architecture would play a role if your dataset is rather small row-wise? It should fit nicely into one NUMA node (256 GB), shouldn't it? You want 1 TB of RAM – how do 90 million rows amount to such a big dataset?

If it is rather the user caches that are not managed correctly – how about testing that? You could create 10 documents with the same content and partition your user base into cohorts that each access one of them.

Now, if 90 million rows are big because they contain many columns – maybe you could split them into a star schema with multiple tables? That way the columns used for calculation are separated from text fields etc., so the calculations will not have to deal with those columns, reducing inefficient data transfer.

Another thing you might want to check is whether you could trade data size for calculation complexity. Maybe you could pre-calculate a few things or create specialized tables – in other words, denormalize your dataset to speed up the calculations. If you only have 90 million rows, extend it into a dataset that has 9 billion rows but contains more or less everything you need, so your stuff can be calculated in a more straightforward way (see the sketch below).
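To make that concrete, here is a minimal sketch in Python/pandas (the table and column names are made up, just to illustrate the trade-off): the expensive aggregate is computed once up front, so the front end only does a cheap lookup instead of a nested aggregation on every click.

import pandas as pd

# Hypothetical fact table: one row per claim line (all names invented).
claims = pd.DataFrame({
    "provider_id": [1, 1, 2, 2, 2],
    "month": ["2014-01", "2014-02", "2014-01", "2014-01", "2014-02"],
    "amount": [100.0, 250.0, 75.0, 125.0, 300.0],
})

# Pre-calculate the expensive aggregate once, at load time...
monthly_totals = (
    claims.groupby(["provider_id", "month"], as_index=False)["amount"]
          .sum()
          .rename(columns={"amount": "monthly_amount"})
)

# ...so the UI layer only needs a cheap lookup against this specialized table.
print(monthly_totals)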

This might be totally irrelevant to your situation, but I just wanted to throw it at you – just for the sake of hearing your reaction, if nothing else.

Anonymous
Not applicable
Author

Sorry if I was not clear, but we have more than one app, and the app I am testing does not take 1 TB of RAM. That app takes about 4-5 GB of space compressed and 20 GB uncompressed in RAM. I also tested a few other apps – they were all very slow.

I understand about app optimization, but at this point I am just trying to make the new server perform at least as well as the old one. The app I am testing is very complex, and we have already spent a lot of time optimizing it using different techniques. I already preload everything I can, but a lot of metrics are not additive and cannot be calculated in the script, unfortunately.

Not applicable
Author

Hi Scalability Team,

We have done the tweaks mentioned in the threads above. We also raised support request 00330211 – QlikView compatibility issue with HP DL580 server. The support team says they cannot help with this, and that the scalability team could help resolve the performance issue with the HP DL580 servers.

Could somebody suggest what the next steps should be, or whether the server is simply incompatible with QlikView?

Thanks,

Sagar

Anonymous
Not applicable
Author

Hi Sagar,

I worked with QV on this and went through their benchmarks. They analyzed the results, came back with a bunch of charts, and confirmed that our DL580 performs similarly to the other DL580s they have tested.

They told us that the DL380 is a much faster server but not as capable once you have hundreds of _concurrent_ users, and that is where the DL580 will shine. They said the DL380 is like a Ferrari and the DL580 is like a school bus – the first can carry only two people but is very fast, while the school bus can carry 100 people but is slow.

I really wish we had known this before we made the purchase, especially because Qlik was involved in this deal and an architect from Qlik suggested this server to us when we asked specifically for the best hardware configuration. They also knew the complexity of our app and our business case.

So now Qlik suggests getting a DL380, but the money was spent, and it looks like we are stuck with a super powerful server that is 4 times more expensive and 4 times slower than the DL380.

Another interesting thing: if your app does not have complex calculations like ours, the difference will not be so huge, but based on these tests it is still unlikely that the 580 can outperform the 380 on raw performance measures.

I am going to mark my answer as correct, since the above was confirmed by Qlik, and I really hope it will help someone avoid making the same mistake.

Not applicable
Author

Thanks Borys for your quick response.

Anonymous
Not applicable
Author

I thought I would post an update... HP was kind enough to take the DL580 back, and we will be getting the newly released Gen9 HP DL380, which has more RAM bandwidth and faster CPUs (the v3 line). I will update you guys again once we get it and test it, but so far the Gen9 specs look promising – especially the new HP SmartMemory, which supposedly runs at 2133 MHz (although at that speed it looks like RAM is capped at 512 GB).

julian_rodriguez
Partner - Specialist

Hello Borys,

I have read your post, and we have a similar problem with a cluster of 5 HP DL560 Gen8 servers. We have noticed that these servers are slower than another DL360 Gen7.

How did your change go?... Did performance improve with the Gen9 HP DL380?... We are seriously considering this option, but it would be great to hear about your experience before making a decision.

Thanks in advance...