We are still in the development stage, but I am trying to think ahead. That said, we will eventually have 36 months' worth of data, with each month containing roughly 300 million rows, so I thought breaking my original QVD file up by YearMonth would make it easier to manage and maintain, and hopefully improve query performance.
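For what it's worth, splitting by YearMonth is usually done in the script with a loop like the sketch below. This is only an illustrative sketch: the table name `Transactions`, the field `YearMonth`, and the QVD path are assumptions, not anything from your app.

```qlikview
// Hypothetical sketch: write one QVD per YearMonth from a resident table.
// Table/field/file names are assumptions.
FOR Each vYM in FieldValueList('YearMonth')
    Slice:
    NOCONCATENATE LOAD * RESIDENT Transactions
    WHERE YearMonth = '$(vYM)';
    STORE Slice INTO [..\QVD\Transactions_$(vYM).qvd] (qvd);
    DROP TABLE Slice;
NEXT vYM
```

The per-month files can then be reloaded individually, which is what makes incremental reloads manageable.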
I am also worried about the user experience. Obviously, we will do what we can on the server side in terms of RAM and CPU, but I am wondering if there is anything else we can do to improve performance, or what the QV best practices are when dealing with large volumes of data. For example:
- Does partitioning a QVD into smaller ones help with users' queries?
- If we have 36 months of data with a total of, say, 500K account_ids, what happens when a user selects or drills down to one account_id and one month? Does QV have to search through every record, or is it smart enough to know where to look?
- Can we optimize QVD files?
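On the last question: QVD loads can run in "optimized" mode, which is much faster, but only when the load does no transformations and uses at most a single-field `WHERE Exists()` filter. A hedged sketch, assuming partitioned files named `Transactions_*.qvd` and a `YearMonth` field:

```qlikview
// Optimized QVD load sketch: no expressions in the field list, and only a
// single-field WHERE Exists() filter, so the load stays in optimized mode.
// Table, field, and file names are assumptions.
MonthsToLoad:
LOAD * INLINE [
YearMonth
201401
201402
];

Facts:
LOAD account_id, YearMonth, amount
FROM [..\QVD\Transactions_*.qvd] (qvd)
WHERE Exists(YearMonth);
```

Adding a calculated field or a compound WHERE clause would silently drop the load back to standard (slower) mode, so transformations are best done once, before the QVD is stored.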
P.S. I am still a QV newbie, so if my comments or questions do not make sense, I sincerely apologize.
A QVD exists only while you store and load it in the script, so it affects only reload (script) performance, not the performance users see in charts. For a better user experience you have to optimize your data model instead.
You could start by answering these questions: Do you know the requirements? Does everyone really need all of that data?
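If most users only need recent history, partitioned QVDs let you load just the last N months into the user-facing app. A minimal sketch, assuming the monthly file naming from above and an arbitrary N of 6:

```qlikview
// Load only the most recent N monthly partitions into the app.
// vN, the file path pattern, and the YYYYMM naming are assumptions.
LET vN = 6;
FOR i = 0 TO $(vN) - 1
    LET vYM = Date(AddMonths(MonthStart(Today()), -$(i)), 'YYYYMM');
    Facts:
    LOAD * FROM [..\QVD\Transactions_$(vYM).qvd] (qvd);
NEXT i
```

Because every partition has the same fields, the repeated loads auto-concatenate into one `Facts` table. A separate, larger app (or loop bound) can serve the few users who genuinely need all 36 months.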
Also, try running some performance tests.
There is also some information on this topic in the community.
Keep in mind that some recommendations strongly depend on context.