We have bought 5 Named CALs and a copy of Qlikview Server (all version 9.0) and are struggling to get started.
There is a guy here who is tasked with developing the reports and my job, as IT mgr, is to setup the infrastructure for him.
I've set up Qlikview Server and given it ODBC access to the relevant datasources. I've also configured the ODBC datasources on the report writers machine and installed the desktop client on it.
Now, he creates a new document and edits the script to bring in the data. However, our overall dataset is 15Gb+ and his machine runs out of virtual memory as I believe he is doing 'select * from bigtable'
Our FD believes that our setup is wrong and that I should have brought the data into the Qlikview server. He believes that once that is done, the report writer should be able to connect to the server, using the client installed on his machine and generate reports without any memory issues.
My belief is that the report generator creates the document on his machine, using the local datasources and has to carefully craft the required SQL to bring in the data he wants (so that he doesn't run out of memory). Then, once the report is ready, we copy it to the server where it runs, periodically bringing in data from the ODBC datasources for the users to view.
Which is right?
If I am right, how do people develop reports where you may have millions of rows and many 10's of Gb of data?
We are fairly new users to QV also. We have many disparate data sources and one of the attractions of QV was the fact that once loaded all the data is stored in the QV file (.qvw) giving fast access, impressive rates of compression and cuts down network traffic.
We develop large applications on our dedicated development server rather than a desktop. Although our desktops will handle viewing reports they can sometimes struggle when developing larger applications.
What kind of data are you connecting to, is it flat file, or a database?
This won't answer your question but I can provide some tips that we have learned which I hope will benefit you.
Depending on your data sources try OLE DB rather than ODBC connections.
Review the machine you are developing on, will it benefit from an upgrade?
During development use the Debugger and limit the load to a few thousand rows and only do a full load once the report is due for testing.
We also try to do as much as we can in SQL and construct views that feed QlikView.
We have also noticed that the qvs process on the server can become very large and benefits from the occasional restart.
Thanks for your reply!
We are connecting to several databases MySQL, MSSQL etc. All over ODBC.
The desktop that the finance chap is working on has 3Gb RAM (WinXP, so that's the limit) and he isn't really familiar with SQL, so I think he is setting up the document to do a "select * from bigtable".
The Qlikview server has 8Gb and we can allocate more if needed, so it might be best for him to work on that.
My sympathies ... we have had a lot of difficulties getting going as well ... not due to data volume but certainly table relationship complexity.
Swapping to OLE DB would be a good start as mentioned and developing on the server will aid in reload. You will notice a big difference if you reload from the server not the desktop. I use RDP to make this easier for myself.
Due to our troubles we also decided to build everything that we would need externally and save the hassle of concatenation, link tables, joins etc in the QlikView script. Building the views in SQL mean you can be absolutely certain of the result in QlikView. With one caveat ... keep tight control of your field names and plan the relationships between tables in advance.
You may also want to investigate loading as binary data to aid with performance.
If you open a QVW on your machine, regardless of where it is stored, it's in memory on your machine. If you reload from your machine, it's doing all of the work and storing all of the data in RAM on your machine. So for development, you are going to be limited by the capabilities of the machine you are doing your development on. Storing the QVW on the server, while perhaps a good idea from a centralization and backup perspective, will not resolve this problem.
So you're both right. I would suggest storing the QVW on the QlikView server. A developer can then connect to the server, using the client installed on the developer machine, and modify the QVW. However, this will NOT solve client side memory issues. As you believe, during development, the document is on the developer's machine and is using local resources (RAM, CPU). The developer should absolutely carefuly craft the SQL to only bring in the necessary data. Select * from bigtable is a big no-no. It's a massive drain on your DBMS. It's a waste of memory, regardless of where it's taking place. NEVER select * from bigtable. Once the QVW is ready, if it was developed locally it should be moved to the server. Then it is the server copy, using server resources, that would be scheduled to run periodically.
How do we develop with large data sources given these restrictions? We've dealt with it in two ways, sometimes used in combination.
First, most of us have 32-bit development machines with only 2 GB of RAM. But we do have ONE 64-bit development machine with I believe 4 GB of RAM. So for larger documents, we remote desktop into that one machine, and do our development that way. We only have a few documents that require this, and most of them are mine, so we've had very few conflicts. When we have two people wanting to use the machine at the same time, we just communicate verbally and decide who's task is higher priority.
Second, we can limit the amount of data used during development. There are many ways to do this, some already mentioned. As another example, for one application, I've created a trivial QVD in production and test that gives me a starting date for the data I load. In production, it refreshes automatically from a database source. In development, the QVD currently stores the beginning of this year and is not refreshed automatically. So when testing, I'm only loading this year's data, which speeds up the load and prevents memory problems. When I install, it will automatically pick up the full intended data set without any modification of the script.
While not directly related to your question, consider as a general rule NOT doing any ODBC or OLE DB connections in the user applications. Consider instead making specialized applications that do the connection to a single or related tables, read in the data in a generic way, and then store the data in QVD files on the server. QVD files can be read VERY quickly by QVWs, so serve as a way of only hitting your database once, and then letting multiple user applications use the data without needing to hit the database again. These QVDs, at least if carefully designed, will slowly grow into a data warehouse for your user applications. Create them only as needed, but keep in mind the big picture when you create them. Have some idea where you're going and how all the pieces will connect.
Could you just clarify one last thing. When opening a qvw thorugh IE on the server ... e.g. qvp://etc. is that data loaded into client RAM or is it loaded into server RAM ... my understanding is that it is server RAM.
Hi Lee, when using the IE plugin you are simply viewing the (potentially huge) data model/qvw hosted on your QlikView Server. It's rather like looking through a window, so all you need client side is enough resources to run a browser and to render the QlikView objects (which isn't much).
On more of a technical note, I am trying to think of the single biggest resource intensive client-side operation a user could do when using the IE plugin..... and I think the worse offender is exporting data on the client side (i.e. exporting to excel). If your QVW contains a pivot chart, straight table or table box that contains a lot of data (rows), it can put load on the client when they try to export this to Excel. It's not a huge consideration but the only example I can think of where the client side hardware does come into play. For general QlikView power though, all of that hard work is done server side.
ps: I know you asked John but I couldn't help but reply
Ah ... thank you for that ... I have noticed that where we run into problems is when we have a straight table that has a large number of rows that pulls data from about 7-8 different tables. I can not concatenate them any more due to certain constraints in the data.
Combined to that their are various date based filters in the same chart that require the server date time ... I have reduced this requirement to reload date time but that does seem to hammer resource.
As _qlikbox said, for IE plugin, the data is on the server. The same should be true when you're doing an open in server, or opening the document in Access Point from any of the client options. The data only resides in client memory for development, when you're actually opening and modifying the document, running script, and so on.
If you're using the reload date time, the date filters should only be an issue if you're implementing them with an if(). If you're making selections or using set analysis, that shouldn't cause any problems that I can think of.