I'd be guessing and testing just like you. I wouldn't think the order of concatenation would make a difference, but maybe it does. Maybe it takes extra work to "go back and append null fields" to the table you have so far. Or it might take no work at all, depending on how QlikView behaves internally. It seems at least worth testing a load of an empty skeleton table with all of the possible fields, just to see if there IS a slowdown when new fields are added. It looks like you already have optimized QVD loads with no conditions. Anything you can do to eliminate or shrink fields might help. I just don't have many ideas.
The slowdown you are seeing in the transition from Sales to Inventory is the switch away from an optimized load (indicated by the "(optimized)" message in the status window). The INVT_Table1.qvd will load unoptimized, which is considerably slower.
Here's my empirical understanding of optimized loads when concatenating QVDs. The rule is: a concatenated QVD load will remain optimized if it loads all of the fields present in every previous load of the same table. It may add new fields as well.
Consider these examples.
QvdA.qvd -- contains fields A, X
QvdB.qvd -- contains fields A, B
-- Optimized load: the second load includes every field from the first --
LOAD A FROM QvdA.qvd (qvd);
LOAD A, B FROM QvdB.qvd (qvd);
-- Unoptimized load: X does not appear in the second load --
LOAD A, X FROM QvdA.qvd (qvd);
LOAD A, B FROM QvdB.qvd (qvd);
Here are some possible workarounds:
1. Change to a linked table model instead of a concatenated model -- which looks appropriate given your data sample. You'll have to generate compound keys to eliminate the synthetic keys.
2. Populate null() fields into the QVDs so the field lists are symmetrical.
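A minimal sketch of workaround 2, reusing the hypothetical field names (A, B, X) and QVD names from the example above -- the table and field names are placeholders, not your actual script. The key point is that the null() field must be added when the QVD is generated; adding null() as X during the final concatenated load is itself a transformation and would break the optimized load:

// At QVD-generation time, pad the narrower table with the missing field:
Invt_Stage:
LOAD A, B, null() as X
RESIDENT Invt_Raw;    // Invt_Raw stands in for your real source table
STORE Invt_Stage INTO QvdB.qvd (qvd);
DROP TABLES Invt_Raw, Invt_Stage;

With both QVDs carrying the same field list, the subsequent concatenated loads of that table should all stay optimized.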
I can confirm your observation -- here is a smaller version of the load (in progress -- still has 45 minutes to go): you can see the first Sales tables load optimized and then the Inventory (Invt) loads are not optimized. There's a good 2-minute pause in the transition, and the data loads more slowly afterwards. Based on your observation, it looks like I lucked out in loading the larger Sales tables first so they would load 'optimized'.
Concerning the linked table, I have tried that for another project. Though it does link things together, the performance is significantly worse with that data model. As our applications have increased in size and complexity, more and more we find better performance (front-end) with one Master Table that is a concatenation of disparate tables.
I will see if I can populate with null values so all tables load with the same data.
Thank you for sharing your insights!
Rob Wunderlich wrote:1. Change to a linked table model instead of a concatenated model -- which looks appropriate given your data sample. You'll have to generate compound keys to eliminate the synthetic keys.
Or just allow QlikView to build the synthetic key, which I suspect here would perform the exact same function as a compound key, but with less effort and higher performance.
Edit: Posted that before I saw your response to Rob. I'd at least check the performance using synthetic keys, but based on your experience, you'll probably get your best performance out of your current plan to add null fields to the various QVDs.
Tyler Waterfall wrote:I was under the impression that synthetic keys are bad, both to performance and to keeping the data straight. Is that a true statement?
While a very good rule of thumb, particularly for new developers, that is not a true statement.
Most of the time, synthetic keys arise through mistakes or inefficiencies in a data model. As a result, synthetic keys usually occur in concert with poor performance and other problems. Clearing up the synthetic keys often simultaneously clears up the data model mistakes and inefficiencies, and the poor performance and other problems go away. This can lead to the mistaken conclusions that synthetic keys CAUSE problems, or that they ONLY occur when there are problems, and that they should therefore always be avoided.
But synthetic keys are not, in and of themselves, a bad thing. While such cases are likely rare, in some situations they are exactly what you want, and the most efficient way to accomplish a task. I give an example with performance testing results in the following thread:
For your particular case, I'm guessing (I could certainly be wrong) that a synthetic key linking together all of your tables would be faster than building a concatenated key yourself, both for load speed and to a lesser extent for chart processing speed.
However, for your particular case, the fastest and I would think best solution seems likely to be Rob's second suggestion - to add null fields to your QVDs.