Hello everyone!
I'm facing a challenge with loading multiple QVD files, specifically fact_data_2019.qvd, fact_data_2020.qvd, fact_data_2021.qvd, and so on. While these QVDs share most of their fields, the newer ones include additional fields, which prevents a straightforward concatenation with the older files when loading with a wildcard.
To work around this and force the concatenation of these QVDs, I usually create an empty table and force the concatenation as follows:
FACT_TABLE:
// Empty dummy table with a single placeholder field
LOAD * INLINE [
    delete
];

// Force every QVD into the table above, regardless of field differences
CONCATENATE (FACT_TABLE)
LOAD *
FROM [qvd_path/fact_data_*.qvd] (qvd);

// Remove the placeholder field again
DROP FIELD delete;
This approach successfully concatenates the data, which is great! However, the load is not QVD-optimized because of the forced concatenation, and load times are quite long since these QVDs often exceed 2 GB.
My question is: Is there a way to achieve the same result (all QVDs concatenated) while leveraging an optimized QVD load?
Note: the older QVDs cannot be modified to add the new fields.
Thanks in advance.
Greetings,
Alex
I think you might be able to use a double load here and keep the optimized load, though I'm not sure. If you can't, I would just run a two-pass load - one optimized load to get the data and then a second load to concatenate (from the resident table).
I believe the trick here is that your first dummy load should load the entire set of field names that will appear in the QVDs.
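For illustration, a minimal sketch of that idea (Key, Amount, and NewField2021 are placeholder field names here; substitute the complete list of field names across all of your QVDs):

FACT_TABLE:
// Empty table (zero rows) whose field list covers every field that appears
// in any of the QVDs, so the widest QVD already matches the table structure
LOAD * INLINE [
Key, Amount, NewField2021
];

CONCATENATE (FACT_TABLE)
LOAD *
FROM [qvd_path/fact_data_*.qvd] (qvd);

// No DROP FIELD needed afterwards, since every field in the dummy table is real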
-Rob
Another idea: perform the load you are already doing once, with all years up to and including last year (but not the current year), and store the table into a new QVD (e.g. fact_data_history.qvd). On each subsequent reload, load that QVD (an optimized load) and concatenate fact_data_2025.qvd into it (not optimized). This will probably be much faster. (A prerequisite, of course, is that the QVDs for previous years are static and don't update.) As soon as 2025 ends and you have a new 2026 file, make sure 2025 is integrated into fact_data_history.qvd; from then on, each reload concatenates the history QVD with the QVD for 2026.
You don't have to create a dummy table first with this approach, and therefore the load of the large history table will be optimized.
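A rough sketch of that pattern, assuming the current-year file is fact_data_2025.qvd and reusing the qvd_path from the original post:

// One-off step (repeated once per closed year): build the history QVD from
// all closed years, e.g. with the forced-concatenation script above, then:
// STORE FACT_TABLE INTO [qvd_path/fact_data_history.qvd] (qvd);

// Regular reload:
FACT_TABLE:
LOAD *
FROM [qvd_path/fact_data_history.qvd] (qvd); // single QVD into a new table: optimized

CONCATENATE (FACT_TABLE)
LOAD *
FROM [qvd_path/fact_data_2025.qvd] (qvd); // current year only: not optimized, but small

This way only the current-year QVD is read unoptimized, so the slow part shrinks from the whole history to a single year.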
Hey, @rwunderlich
Thanks for your help.
I created an inline table with the same fields as last year's QVD, and the load was optimized, as you suggested. However, when I tried to load multiple QVDs using a wildcard, the load was no longer optimized (not even for the QVD whose fields matched the inline table). From what I understand, as soon as two QVDs differ in any of their fields, the optimized load against the inline table is lost.
Appreciate your time.
Best regards,
Alex
Hey, @henrikalmen
Thanks for your suggestion — it’s definitely an elegant approach.
In a different scenario, I’d consider implementing it. However, in my case, each QVD is around 2.5 GB, and there are currently six of them.
Greetings,
Alex
Hey, @Or
I'm not sure I fully understood your approach. The problem is that we get new fields every year, so the load would only be optimized for the most recent year.
Thank you for your response.
Greetings,
Alex