Not applicable

Loading Big Datasets

Hi,

I am facing a problem: a query returns about 200 million records from a DB server, and I am trying to load the result into a QVD. It takes about 6 hrs to load 13 million records.

The query itself is fast and returns in under 20 mins. The bottleneck is in moving the data from the DB into QlikView.

Are there any means of speeding this up?

Thanks.

9 Replies
tresesco
MVP

Increase hardware (RAM, CPU) capacity.

AbhijitBansode
Specialist

Data load time depends upon the hardware configuration of the machine on which QV is installed.

Also, try to optimize your load process by implementing an incremental load.

Do you really need 200 million rows of data in QV?

jagan
Luminary Alumni

Hi Gokul,

You can try incremental loading: instead of loading all the data daily, load only the current day's data and keep the previous data in QVDs.

Refer to the QlikView help file or community posts on incremental loading.

Loading data from a QVD is up to 100x faster than loading from the database, and with incremental loading you will also pull less data from the database.
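
For example, a minimal sketch of the usual incremental pattern. The table and field names (Transactions, TransactionID, TransactionDate) and the vLastLoadDate variable are placeholders, not from this thread, and vLastLoadDate is assumed to be set earlier in the script:

// 1. Pull only the new rows from the database.
Transactions:
SQL SELECT TransactionID, TransactionDate, Amount
FROM Transactions
WHERE TransactionDate >= '$(vLastLoadDate)';

// 2. Append the history already stored in the QVD,
//    skipping keys that were just reloaded.
Concatenate (Transactions)
LOAD *
FROM Transactions.qvd (qvd)
WHERE NOT Exists(TransactionID);

// 3. Write the combined table back to the QVD.
STORE Transactions INTO Transactions.qvd (qvd);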

Regards,

Jagan.

Not applicable
Author

Load the data in chunks... maybe that will help you.
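
A rough sketch of what that could look like inside one script. BigTable, the numeric ID column, and the 5-way split are all assumptions here; the MOD() call is evaluated by the database, so use whatever modulo syntax your DB supports:

FOR i = 0 TO 4

    // Load one fifth of the rows per pass.
    Chunk:
    SQL SELECT *
    FROM BigTable
    WHERE MOD(ID, 5) = $(i);

    // Store the chunk to its own QVD and free the memory.
    STORE Chunk INTO [Chunk_$(i).qvd] (qvd);
    DROP TABLE Chunk;

NEXT i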

Not applicable
Author

Thanks Tresesco and Abhijit. Unfortunately, we cannot increase the hardware.

Do you think incremental loading is going to help? Incremental loading is useful when we are not able to execute the entire query in one go, so we break the dataset and concatenate the results. Since the query has already returned and it's just the loading of the records that is slow, I am somewhat doubtful.

avinashelite

Hi,

Try to follow the points below:

1. Eliminate unwanted columns; this will reduce the data volume.

2. Aggregate the data as much as possible.

3. Push all the calculations to the script level (see the sketch after this list).

4. If the data is too huge, divide it into multiple QVDs on a yearly or regional basis, etc.
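
A rough sketch of points 1-3 in one load statement; all table and field names (SalesOrders, Region, SalesDate, Quantity, UnitPrice) are made up for illustration:

Sales:
LOAD
    Region,
    SalesYear,
    Sum(LineValue) AS TotalSalesValue    // pre-aggregate before storing
GROUP BY Region, SalesYear;
LOAD
    Region,
    Year(SalesDate) AS SalesYear,        // calculation pushed to the script
    Quantity * UnitPrice AS LineValue;   // rather than into chart expressions
SQL SELECT Region, SalesDate, Quantity, UnitPrice
FROM SalesOrders;

STORE Sales INTO Sales.qvd (qvd);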

Hope this helps you 

Not applicable
Author

Hi

I want to add another point: split the data across multiple scripts, run them as parallel processes, and generate multiple QVDs. That will also reduce the total data-extraction time.

Thanks

Not applicable
Author

Hi Sovan, I think parallel processes will help here.

Could you please provide an example of how to parallelize load scripts?

AbhijitBansode
Specialist

Create separate extraction applications to load the data from the database.

Let's say you have 200 million records: create 5 separate apps to load the data (40 million each).

(You can increase the number of applications depending on performance.) Each app has to specify a WHERE clause to distinguish its slice of the data.

Store the slices in separate QVD files, and then load all the QVD files in a single application and concatenate the data.
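
A rough sketch of that approach; the slicing column (OrderID), the ranges, and the file names are assumptions:

// Each of the 5 extractor apps differs only in its WHERE clause.
// Extractor app #1:
Slice:
SQL SELECT *
FROM BigTable
WHERE OrderID BETWEEN 1 AND 40000000;

STORE Slice INTO Extract_1.qvd (qvd);
DROP TABLE Slice;

// Consolidation app: a wildcard QVD load picks up all five files and
// auto-concatenates them because the field sets are identical.
AllData:
LOAD *
FROM [Extract_*.qvd] (qvd);

STORE AllData INTO AllData.qvd (qvd);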