I am creating a report that displays daily data aggregated to that month
over a period of a few years. Each month, it has to summarize about 10-25
million records of daily data for that month and add to the existing
summarized report. I have 3 stages of doing this : 1- Extract from source,
2 - Process the QVD from step 1 by transforming it to a preset QVD tables
structure with dimensions etc. The data is at the same level of detail of
step 1 here. 3- Summarize it to show monthly totals. I will be loading this
incrementally. Here is my dilemma - which of these approaches is better in
terms of memory and time usage?
a) Select records for this month from source table and concatenate to last
months' QVD? So, the result of this is a 1GB file at step 1.
or
b) Select records for this month from source table and save it to the QVD.
Then, do incremental loading by doing an insert/update based on existing
Step 2 QVDs. So, the result of this is 1GB file at step 2.
Which approach should I go for?
TIA