Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Dear Friends,
I have 2 Fact tables with equal number of columns coming from 2 different subject areas. I have concatenated both tables and storing the concatenated table as QVD. I observed that size of the QVD is almost 3 times of individual QVD sizes of those 2 facts.
Fact1 QVD Size if I store individually: 32 MB( record count 1.3 million)
Fact 2 QVD Size if I store individually: 168 MB ( 9 Million)
After I concatenated 2 tables , Combined QVD size is 486 MB.
Please suggest how I can reduce the size.
Br,
Chinna
Could you post the meta data (XML-part) of your three QVDs?
If your field values are quite different in both QVDs (so it needs more bits to represent the distinct values in the facts and maybe as well more bytes to represent the symbols, like for mixed number / text representation), this may explain some of your size increase.
Lots of %IDs there. Any chance they were created with Autonumber during the QVD creation steps? Because if so, concatenating the tables will break consecutive integer optimization. Maybe
In addition, you are creating some of the fields in your load with a constant value.
Are you comparing your numbers using the original QVD sizes or after transformation, but not concatenated?
Also note that the constant field value optimization (QV does not need to store a bit pointer for a constant field value) doesn't kick in if you are concatenating the table with records with distinct values. Since you are using quite a lot of these fields in either table, this might explain very well the increase in fact table size.
No,we haven't created the %IDs using autonumber. Those are direct key values from dimensions.
However, These QVDs are not straight forward ones. These are transformed QVDs.
without constant values also it is giving same size. I have put the constant values just to enforce auto concatenate .
Dear Stefan,
Thanks for your time.
How to extract the XML part??
Br,
Chinna
Open the file with a text editor. And copy the xml part. Basically the part that's readable.
I think the key point about the constants is that you are not only concatenating the tables, you are also making each of them wider (by adding more fields). Especially the second, larger table, where you add over 10 columns. This widening has its own cost in space, no matter if you fill the added columns with zeros, empty strings, nulls (which Concatenate would add automatically if you did not assign them manually), or whatever else - at least that's how I understood Stefan's remark.
You could not test it by adding the columns to each QVD and storing the results separately, because QlikView would wise up to the fact all values in the column are for example 0 and use nearly no space because of that. After you actually concatenate the tables, it's no longer the case (QlikView cannot use this single value in column compression trick anymore) and the size grows.
The part is the text part at the beginning of your file, e.g. when opening in a decent editor like notepad++.
PFA QVD XML
You would need to extract that for the two original QVDs and the final, then compare the meta data for the fields.
For example, have a look at field #ACT_SALES_AMT_RATE_A
<QvdFieldHeader>
<FieldName>#ACT_SALES_AMT_RATE_A</FieldName>
<BitOffset>96</BitOffset>
<BitWidth>16</BitWidth>
<Bias>-2</Bias>
<NumberFormat>
<Type>UNKNOWN</Type>
<nDec>0</nDec>
<UseThou>0</UseThou>
<Fmt></Fmt>
<Dec></Dec>
<Thou></Thou>
</NumberFormat>
<NoOfSymbols>36088</NoOfSymbols>
<Offset>113581</Offset>
<Length>324788</Length>
<Comment></Comment>
<Tags>
<String>$numeric</String>
</Tags>
</QvdFieldHeader>
You see that there are a lot of distinct symbols, and QlikView uses 16 bits to represent.
But in one of your input table, this field is missing or constant, so QV does not need to use any bits for this field in the fact table - but only until you concatenate the table with the 16 bit symbol pointer.
If your first table has a lot of records and some more fields like this, this could explain the unexpected increase in size.
(and using the meta information of all three tables, you can actually double check this by calculating the sizes manually).