I have a scenario where I fetch one year of data from a database with incremental loading. Which of the approaches below will be faster in terms of load time and performance?
Extracting the data and:
1. Storing a separate QVD for each date in the year (365 QVDs in total) and loading them into Qlik
2. Storing month-wise QVDs (12 QVDs in total) and loading them into Qlik
3. Maintaining a single QVD for the entire year.
Please guide me!
Thanks in advance.
I would opt for:
3. Maintaining a single QVD for the entire year.
Option 1 would be quicker, though probably not significantly so, and option 2 to a lesser extent; but any gain would normally be overshadowed by the additional complexity of scripting and maintenance.
If data volumes are large, say 10+ million rows a day (3.65+ billion rows a year), then I would consider option 1 or 2, but only if option 3 had been tried first and its performance was dreadful.
The bottleneck would probably be the latency in retrieving data from the database, and I would not be surprised if that contributed 99+% of the load time.
Note: The numbers above are guesstimates as I know nothing about the details of your scenario.
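A minimal sketch of what option 3 could look like as an incremental load script. The table, field, and file names (Transactions, TransactionID, ModifiedDate, the lib:// path) and the vLastExecTime variable are all hypothetical placeholders, not from the original post:

```
// Sketch only: assumes vLastExecTime holds the previous reload timestamp
// and TransactionID uniquely identifies a row.

// 1. Pull only new or changed rows from the database.
Transactions:
SQL SELECT TransactionID, TransDate, Amount
FROM Transactions
WHERE ModifiedDate >= '$(vLastExecTime)';

// 2. Append the historical rows already stored in the QVD, skipping any
//    IDs just reloaded. A single-field NOT Exists() keeps the load optimized.
Concatenate (Transactions)
LOAD TransactionID, TransDate, Amount
FROM [lib://Data/Transactions_2024.qvd] (qvd)
WHERE NOT Exists(TransactionID);

// 3. Write the merged result back to the single yearly QVD.
STORE Transactions INTO [lib://Data/Transactions_2024.qvd] (qvd);
```

With this pattern, only the database extract in step 1 hits the source system; the bulk of the rows come from the local QVD, which is typically where the load time advantage of a single yearly file shows up.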
Hi,
For me the second option is the best, because:
In the case of a specific historical data recovery, monthly QVDs will be easier to maintain; with daily QVDs you would need to loop over them to do a refresh, and with a yearly QVD you would have to do an unnecessary refresh of hundreds of thousands or millions of lines.
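The monthly approach described above could be sketched as follows. The table, field, and path names are hypothetical, and the example refreshes only the current month before reassembling the year:

```
// Sketch only: refresh just the current month's QVD,
// then read all monthly QVDs back for the yearly model.
LET vMonth = Date(MonthStart(Today()), 'YYYY-MM');

CurrentMonth:
SQL SELECT TransactionID, TransDate, Amount
FROM Transactions
WHERE TransDate >= '$(=Date(MonthStart(Today()), ''YYYY-MM-DD''))';

STORE CurrentMonth INTO [lib://Data/Trans_$(vMonth).qvd] (qvd);
DROP TABLE CurrentMonth;

// Rebuild the full year from the 12 monthly QVDs with a wildcard load.
Transactions:
LOAD TransactionID, TransDate, Amount
FROM [lib://Data/Trans_*.qvd] (qvd);
```

A historical correction then only requires regenerating the affected month's file, not the whole year.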
The first option will be the slowest because of disk I/O. What kind of incremental load is it? Can the data change at any historical datapoint?
Tomasz
Dear Jeshwin,
If your database is fast enough to retrieve the information and it only receives daily updates, I guess a full load will be easier.
But if you need to detect changed field values, a daily incremental load will be required.
Enjoy
None of the options is right or wrong. To me, the two important aspects for choosing QVD granularity are the QVD file size and the QVD ingestion.
If apps consuming the QVD file(s) end up loading the file as unoptimized due to data transformation, then there might be room for altering the granularity. Keep in mind that data can still be filtered in an optimized fashion, for example by using the Exists() function.
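As an illustration of the Exists() point above, here is a hedged sketch of an optimized filtered load. The field and file names are hypothetical; the key detail is that an Exists() test on a single, unrenamed field does not break the optimized QVD load:

```
// Sketch only: build a temporary table of wanted dates, then load the
// large QVD filtered by Exists() while keeping the load optimized.
DatesWanted:
LOAD Date(MakeDate(2024, 1, 1) + RecNo() - 1) AS TransDate
AutoGenerate 31;   // e.g. January only

Transactions:
LOAD TransactionID, TransDate, Amount
FROM [lib://Data/Transactions_2024.qvd] (qvd)
WHERE Exists(TransDate);

DROP TABLE DatesWanted;
```

Adding any transformation (calculated fields, renames, a WHERE clause other than a single-field Exists()) would force an unoptimized load, which is when slicing into smaller files starts to pay off.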
Very large QVD files (10+ GB) often end up being filtered during load, which naturally means they could have been pre-filtered by slicing the data into multiple files instead.
If you actually require a very large volume of data (10+ GB) in the same analysis, there are often better ways to slice the data during user access, for example with ODAG or session apps. Then again, these techniques also enable using smaller QVDs, as they filter the data as it is made available to the business user.