Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

QVD size doubling when concatenating 2 identical fact tables

Dear Friends,

I have 2 Fact tables with equal number of columns coming from 2 different subject areas. I have concatenated both tables and storing the concatenated table as QVD. I observed that size of the QVD is almost 3 times of individual QVD sizes of those 2 facts.

Fact1 QVD Size if I store individually: 32 MB( record count 1.3 million)

Fact 2 QVD Size if I store individually: 168 MB ( 9 Million)

After I concatenated 2 tables , Combined QVD size is 486 MB.

Please suggest how I can reduce the size.

Br,

Chinna

21 Replies
swuehl
MVP
MVP

Could you post the meta data (XML-part) of your three QVDs?

If your field values are quite different in both QVDs (so it needs more bits to represent the distinct values in the facts and maybe as well more bytes to represent the symbols, like for mixed number / text representation), this may explain some of your size increase.

kuba_michalik
Partner - Specialist
Partner - Specialist

Lots of %IDs there. Any chance they were created with Autonumber during the QVD creation steps? Because if so, concatenating the tables will break consecutive integer optimization. Maybe

swuehl
MVP
MVP

In addition, you are creating some of the fields in your load with a constant value.

Are you comparing your numbers using the original QVD sizes or after transformation, but not concatenated?

Also note that the constant field value optimization (QV does not need to store a bit pointer for a constant field value) doesn't kick in if you are concatenating the table with records with distinct values. Since you are using quite a lot of these fields in either table, this might explain very well the increase in fact table size.

Anonymous
Not applicable
Author

No,we haven't created the %IDs using autonumber. Those are direct key values from dimensions.

However, These QVDs are not straight forward ones. These are transformed QVDs.

without constant values also it is giving same size. I have put the constant values just to enforce auto concatenate .

Anonymous
Not applicable
Author

Dear Stefan,

Thanks for your time.

How to extract the XML part??

Br,

Chinna

Gysbert_Wassenaar

Open the file with a text editor. And copy the xml part. Basically the part that's readable.


talk is cheap, supply exceeds demand
kuba_michalik
Partner - Specialist
Partner - Specialist

I think the key point about the constants is that you are not only concatenating the tables, you are also making each of them wider (by adding more fields). Especially the second, larger table, where you add over 10 columns. This widening has its own cost in space, no matter if you fill the added columns with zeros, empty strings, nulls (which Concatenate would add automatically if you did not assign them manually), or whatever else - at least that's how I understood Stefan's remark.

You could not test it by adding the columns to each QVD and storing the results separately, because QlikView would wise up to the fact all values in the column are for example 0 and use nearly no space because of that. After you actually concatenate the tables, it's no longer the case (QlikView cannot use this single value in column compression trick anymore) and the size grows.

swuehl
MVP
MVP

The part is the text part at the beginning of your file, e.g. when opening in a decent editor like notepad++.

Anonymous
Not applicable
Author

PFA QVD XML

swuehl
MVP
MVP

You would need to extract that for the two original QVDs and the final, then compare the meta data for the fields.

For example, have a look at field #ACT_SALES_AMT_RATE_A

  <QvdFieldHeader>

       <FieldName>#ACT_SALES_AMT_RATE_A</FieldName>

       <BitOffset>96</BitOffset>

       <BitWidth>16</BitWidth>

       <Bias>-2</Bias>

       <NumberFormat>

         <Type>UNKNOWN</Type>

         <nDec>0</nDec>

         <UseThou>0</UseThou>

         <Fmt></Fmt>

         <Dec></Dec>

         <Thou></Thou>

       </NumberFormat>

      <NoOfSymbols>36088</NoOfSymbols>

       <Offset>113581</Offset>

       <Length>324788</Length>

       <Comment></Comment>

       <Tags>

         <String>$numeric</String>

       </Tags>

     </QvdFieldHeader>

You see that there are a lot of distinct symbols, and QlikView uses 16 bits to represent.

But in one of your input table, this field is missing or constant, so QV does not need to use any bits for this field in the fact table - but only until you concatenate the table with the 16 bit symbol pointer.

If your first table has a lot of records and some more fields like this, this could explain the unexpected increase in size.

(and using the meta information of all three tables, you can actually double check this by calculating the sizes manually).