Discussion Board for collaboration on QlikView Scripting.
Hi, I've got a QV document which currently has a main table with approx 25 million rows and about 80 columns wide. As the amount of data has increased and some of the formulae have become more complicated, the performance has suffered. We initially increased the RAM and CPU, but with the latest increase we have suffered a performance drop again, and this time our IT dept cannot increase the resources without significant spend (we currently have 8GB RAM and 4 dual-core processors).
My question is: are there any general rules about how the data is stored? The data comes from several fact tables in SQL, so a lot of the columns are not applicable and will be null. In addition, when pulling from SQL the fact tables are joined to the dimension tables, so QV will have a column for Date, weekday name, week commencing, month commencing, year, etc. Would it be better to leave this in its original state?
I am familiar with linking and indexing tables, and with how to store the data in SQL, but I need some pointers for QlikView. Does it matter what order the data is loaded in? Is there a sort/index function? Do the calculations run better in one big table or two smaller ones?
QVD files are a great way to improve a QlikView model's performance.
The scripting for it is not too complicated, and can be quite powerful when working with massive amounts of data.
Great for:
1) Increasing load speed
2) Decreasing load on database servers
I've attached the part of the manual covering the various ways to create QVDs.
Hope it helps.
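For reference, the basic store-and-reload pattern looks roughly like this (the table name, SQL source, and file paths are placeholders; adjust to your environment):

```qlikview
// Hedged sketch: create a QVD once from the database,
// then reload from the QVD on subsequent runs.
Facts:
SQL SELECT * FROM dbo.FactSales;

STORE Facts INTO [..\QVD\FactSales.qvd] (qvd);
DROP TABLE Facts;

// Later reloads read the QVD instead of hitting SQL Server.
// A plain LOAD * with no transformations stays "optimized",
// which is the fastest way QlikView can read a QVD.
Facts:
LOAD * FROM [..\QVD\FactSales.qvd] (qvd);
```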
Hi Xena, I already use QVD files as part of the load process; as you say, these load much faster. I then pull only the new data from the server.
Once loaded, the issue is the amount of RAM being used and the CPUs maxing out when QlikView does the calculations. My main thought was that the data would be better in multiple small tables all linked together, rather than in fewer large tables, but this would be quite a big rebuild, so I was wanting some guidance and advice as to whether it would be worth pursuing.
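For anyone following along, the incremental pattern described here is usually scripted along these lines (the variable, key field, and file names are assumptions, not the poster's actual script):

```qlikview
// Hedged sketch of an incremental load: fetch only new rows from SQL,
// then append the unchanged history from the QVD.
Facts:
SQL SELECT * FROM dbo.FactSales
WHERE SaleDate > '$(vLastReloadDate)';

// Append historical rows from the QVD, skipping any keys
// that were just re-read from SQL to avoid duplicates.
Concatenate (Facts)
LOAD * FROM [..\QVD\FactSales.qvd] (qvd)
WHERE Not Exists(SaleID);

// Write the combined table back for the next reload.
STORE Facts INTO [..\QVD\FactSales.qvd] (qvd);
```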
QVD files can greatly improve the reload performance, but they have nothing to do with the run-time performance.
As for the relation between the data model structure and the run-time performance, my impression is that from QV 8.50 on it is better to have a small number of large tables. More important is that the data model should be a star schema: one fact table and a number of dimension tables. Certainly the complexity of the expressions in the front-end objects plays a role. Maybe it makes sense to set up some calculation conditions and force users to make selections.
In general, even very large applications with a star data model perform well on a QlikView Server on a 64-bit machine.
I agree with Michael. An application with some 20 million records over 16 fields can be handled locally on a standard PC with 2 GB of RAM, so your data volume should not be a problem for a server. I would rather recommend looking at the scripting and trying to optimize there.
Also try to do as many calculations as possible while loading the data and writing to QVD files. After that, the calculated results are just more data in memory and will perform much, much faster. Not all calculations can be handled this way (obviously), but certainly more of them can when using a star schema to create a single fact table.
You probably already have things like 'Month(date) as Month' or 'Name&Lastname AS FullName' in the LOAD script, but you might be able to add some more calculations as well.
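A sketch of what moving expression work into the LOAD can look like (all field names here are assumptions):

```qlikview
// Hedged sketch: precalculate row-level results during the load
// so chart expressions become simple sums on ready-made fields.
Facts:
LOAD *,
     Month(SaleDate)      AS SaleMonth,
     Year(SaleDate)       AS SaleYear,
     Quantity * UnitPrice AS LineValue,
     If(Quantity * UnitPrice > 1000, 'Large', 'Small') AS OrderSize
FROM [..\QVD\FactSales.qvd] (qvd);
```

Note that adding expressions like these makes the QVD read non-optimized and slower to reload, but the run-time charts then only aggregate precalculated fields, which is usually the better trade when RAM and CPU at calculation time are the bottleneck.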
Some tips that I commonly use:
1- Check the columns that aren't being used and remove them.
2- Use QlikView Optimizer and pay attention to the top fields (sorted by bytes). Try to reduce them (are they necessary? can I use Autonumber?).
3- Try to use Autonumber for the fields that relate the tables.
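To illustrate tip 3, AutoNumber replaces long key values with compact sequential integers, which shrinks the symbol tables (table, field, and file names below are placeholders):

```qlikview
// Hedged sketch: use the same AutoNumber() expression on both
// sides of a link so the tables still associate correctly.
Orders:
LOAD AutoNumber(OrderID) AS %OrderKey,
     Quantity,
     UnitPrice
FROM [..\QVD\Orders.qvd] (qvd);

OrderDates:
LOAD AutoNumber(OrderID) AS %OrderKey,
     OrderDate
FROM [..\QVD\OrderDates.qvd] (qvd);
```

Keep in mind that AutoNumber values are only consistent within a single reload, so apply it to every linked table in the same script rather than baking the numbers into QVDs built by separate reloads.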
I don't know if you have thought of it, but you could start your application by selecting the current month and year.
We did that in our application: create a macro that selects the current month and year when the document is opened.
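A macro along these lines (VBScript in the document Module, fired from an OnOpen trigger) is one way to do it; the field names "Year" and "Month" and the month format are assumptions about your data model:

```vbscript
' Hedged sketch of an OnOpen macro that preselects the current period.
' Adjust field names and value formats to match the actual model.
Sub SelectCurrentPeriod
    ActiveDocument.Fields("Year").Select CStr(Year(Now))
    ActiveDocument.Fields("Month").Select MonthName(Month(Now))
End Sub
```

Starting with a selection applied means the initial chart calculations run over a fraction of the 25 million rows, which helps exactly the RAM/CPU problem described above.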