Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

QVD File Size Limit

Hello All,

I have a scenario in which I would need to aggregate large amount of data (about 11 million rows for 2 days). I am not able to load this data into QVD and it kind of takes forever.

However, the bigger problem is I would need to aggregate it for a month. Thats roughly 100+ million rows !!

Any suggestions about how to handle this scenario would be greatly helpful!!

Thanks.

9 Replies
suniljain
Master
Master

100+millions rows is not huge volume . Try to handle aggregation in script itself  and push maximum business logic in back end .

Not applicable
Author

Hi Sunil,

Thanks for the answer. The 100 mil really is the result set after aggregation(far more rows in the actual tables). Also all aggregation is done in the DB. I am just loading the aggregated result from DB into QlikView and then into QVD. And still I am not able to accomplish this in any reasonable amount of time.

How long does it generally take to read say 10 million rows of 20 columns into QV?   

tresesco
MVP
MVP

It could depend on many factors, like your RAM size, CPU, db source(location), network trafiic, calculation complexity in the script to mention few. It could be as fast as 2 minutes; it could be never-ending(like in your case). Therefore, you should probably look into those factors and try to improve there.

Anonymous
Not applicable
Author

Are you aggregating data at the time of load  into Qlikview (in select query) ? or aggregated data is already loaded to DB table and you are just fetching that data with simple select query ?

Not applicable
Author

Yes, Select query is aggregating. The script is something like the following:

Load *;

Select x,y,sum(z)

FROM A,B,C

WHERE ...

Group by x,y


But it is my understanding, that the Select is handled at the database and and Qlikview just loads the results. Please correct if I am mistaken.

Also, in some other BI tools, there is an option of dynamic querying. The reports have a prompt page and depending on the values selected we will be able to filter the tables in the query. In other words, we never query the entire table only a subsection of it. Can we do something similar in QV?

Thanks.

tresesco
MVP
MVP

Gokul krishnaa CB wrote:


But it is my understanding, that the Select is handled at the database and and Qlikview just loads the results. Please correct if I am mistaken.

Yes you are right here. However, the time being taken by your db still would be counted in qv load for the script:

Load *;

Select x,y,sum(z)

FROM A,B,C

WHERE ...

Group by x,y

Because, qv would only work on returned set of data. Try to run the same query in the db directly and see how long it takes.

Anonymous
Not applicable
Author

ok .. so you can just try to load all the data in qlikview with simple select query and then do aggregation in qlikview with resident load and then store resulted table to qvd.

Lets see if your performance is increased . Because your DB query is taking time because of aggregation function .

Also try to see the time taken by your simple select query in database.

Not applicable
Author

Thanks Tresesco. But my doubt is, does accessing the db via QV(OLE DB) somehow increasing the load time?

The following post is interesting in this regard.

Incremental Loads are dead, long live the CSV

I am not sure if this works as well.

Balkumar, I think joining in QV is bound to take more time than if we do it in the native DB.

tresesco
MVP
MVP

That is very much a possibility. Thanks for the link an interesting one to play with.