peter_turner
Partner - Specialist

Tips for a better way to reload lots of QVD data

Hello Everyone,

I am loading 2 million+ records from 100+ QVD files (300 MB), but this takes a while.

At present, each time new data is available, I grab it and save it to a QVD (which is fine), then load every QVD with a wildcard load (which is where the problem is).

My question is: since the main 'data' table is already in RAM, what's the best way to incrementally load/add/partial-reload/join just the new QVD file?

At present I'm using something like...
NewData:
// load new data from the DB, filtered on timestamps etc.
LOAD *;
SQL SELECT * FROM SourceTable WHERE Timestamp > '$(vLastLoad)';  // placeholder table/field/variable names

STORE NewData INTO qvd123456.qvd (qvd);
DROP TABLE NewData;

AllData:
LOAD * FROM *.qvd (qvd);  // wildcard load of every QVD in the folder

My project is reloaded by the QV9 server (32-bit).
Would my main 'Data' table be stored in the QVW (which is 120 MB)? And what's the best way to add the new data to the start of the 'Data' table without reloading the whole lot every time?
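
To make it concrete, here is roughly the shape I'm imagining (just a sketch; 'AllData.qvd' is a hypothetical consolidated file I'd maintain):

AllData:
LOAD * FROM AllData.qvd (qvd);  // previously consolidated history (optimized QVD load)

CONCATENATE (AllData)
LOAD * FROM qvd123456.qvd (qvd);  // only the newest extract

STORE AllData INTO AllData.qvd (qvd);  // roll the new rows into the consolidated file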

Peter.

12 Replies
johnw
Champion III

I haven't fiddled with chaining events together in QV9 server. My understanding is that it can be done, but is a bit more complicated than in earlier versions. Something about setting up prerequisites for each job, I believe? So each job would have a prerequisite of the previous job.

prieper
Master II

I haven't done much batch handling in QV, but as a first approach I would try to trigger this with variables/flags in the script and a loop waiting for the condition to become true (or via small .txt files containing the variables, in order to hand over more detailed info).
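
For example, a rough sketch of the polling idea ('ready.txt' is just a hypothetical flag file written by the upstream job):

// wait until the upstream job drops the flag file
LET vFlagFile = 'ready.txt';
DO WHILE IsNull(FileTime('$(vFlagFile)'))
    SLEEP 10000;  // check again every 10 seconds
LOOP
// ...dependent load continues here...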

Peter

Not applicable

A few comments...

Oleg's suggestion generally sounds good; I have used the same kind of setup myself. Combining hundreds of QVDs into one is good, because reading one big file is much faster: opening each file always takes time.
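
Something like this, as a sketch (file names are placeholders):

// nightly consolidation of the many small QVDs into one big one
AllData:
LOAD * FROM Data_*.qvd (qvd);  // wildcard load of all the small files
STORE AllData INTO AllData.qvd (qvd);  // the app then reads this single file
DROP TABLE AllData;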

In phase 2, if you need to do calculations over the whole raw data set, it is usually faster to store it into a temp QVD file and read it back than to use RESIDENT LOAD statements; it is faster and does not take as much memory.
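
For instance, a sketch with placeholder names:

// instead of:  Calc: LOAD ... RESIDENT RawData;
STORE RawData INTO temp_raw.qvd (qvd);
DROP TABLE RawData;

Calc:
LOAD OrderID,
     Amount * 1.2 AS AmountWithTax  // placeholder transformation
FROM temp_raw.qvd (qvd);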

With big data sets you should avoid joins where possible. It is usually possible to convert them into mapping statements, which are faster and take much less memory than joining two big tables.
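
For example, a sketch with placeholder table and field names:

// the small lookup side of the join becomes a mapping table
CustomerMap:
MAPPING LOAD CustomerID, CustomerName
FROM Customers.qvd (qvd);

Facts:
LOAD *,
     ApplyMap('CustomerMap', CustomerID, 'Unknown') AS CustomerName  // replaces a LEFT JOIN
FROM Facts.qvd (qvd);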

I had a customer with a big file whose reload at worst took about 42 GB of server memory and lasted about 10 hours. After converting the joins into mappings and using temp QVD files instead of resident loads, it needed about 1.5 GB of memory and took about 3 hours. And we could have made it much faster still by using an Oleg-type setup.

Hope this helps you.

Mikko Vasanko

Senior Solution Consultant

Qliktech Finland