diwakarnahata
Creator

Concatenation from QVDs taking unexpected space

Hello All,

I am concatenating two identical tables from two different QVDs into a QVW saved with compression set to None.

The QVD sizes are 1.63 GB and 0.03 GB, while the QVW size after concatenation is 2.6 GB.

Ideally the QVW should be around 1.66 GB. Any idea where the extra ~1 GB is going in the QVW?

As a proof of concept, I created a dummy QVD of 1.6 MB and loaded it twice. In the second load I renamed 2 columns, and the final QVW size became 5.2 MB instead of 3.2 MB, which suggests that the nulls are taking a lot of space. Any suggestions on how we can save this space, and what impact it will have on performance?

Regards,

Diwakar

7 Replies
ciaran_mcgowan
Partner - Creator III

If you store the resulting table to another QVD, what size is it?

You could also try disabling data lineage in QlikView's hidden settings menu:

Help > About QlikView > (right-click on the QV logo) > Set AllowDataLineage = -1

Gysbert_Wassenaar

Try opening the QlikView document without data, then reload the document, and finally save the reloaded document again.


avinashelite

After loading the data, are you creating any charts or joins in the data model?

diwakarnahata
Creator
Author

Hi All,

Please find below the results of the tests I did on the sample data (QVD attached):

1. The dummy QVD of 1.6 MB, when loaded twice and concatenated (with two columns aliased to different names), resulted in a QVW of 5.1 MB, and the QVD stored from the concatenated table is 5 MB.

2. I also disabled data lineage and re-executed the script; the size is still 5.1 MB.

3. I also opened the app without data and reloaded; I am still getting the same result.

@Avinash: No, I am not creating any charts in the QVW; it's just data.

I am attaching the sample QVD and QVW. I load the QVD twice into an empty QVW using the script below.

PLEASE NOTE: I am using QlikView 12 SR1 Desktop.

Table1:
LOAD col1,
    col2,
    col3,
    col4,
    col5
FROM Table1.qvd (qvd);

Concatenate(Table1)
LOAD col1,
    col2,
    col3,
    col4 as col6,
    col5 as col7
FROM Table1.qvd (qvd);

store Table1 into ConcatTable1.qvd (qvd);

Let me know how this could be resolved.

Regards,

Diwakar

kaushiknsolanki
Partner Ambassador/MVP

Hi,

It's because the data size is calculated per column.

Let's assume that you are loading only 1 column and it contains 0.2 million records with 1080 distinct values.

And the approximate size is 600 KB.

Now if you load the same column a second time under a different name, you add more size to your application.

Likewise, as you keep adding columns the data size keeps increasing, because the data size depends on the individual columns.

So in your case, when you concatenate the QVD with itself the size doubles, but when you rename a couple of columns the overall column count increases, and so the size more than doubles.
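
To make this concrete, here is a minimal sketch that could be appended to the end of the script posted earlier in the thread. It uses only the table and field names from that script; the variable names (vFields, vRows, vCol4Count, vCol6Count) are made up for illustration. It is meant to show that the aliased columns end up as separate fields, each with its own set of distinct values, and it does not measure bytes.

```
// Assumes Table1 was built by the two LOAD statements posted earlier in the thread.
LET vFields    = NoOfFields('Table1');     // number of fields in the concatenated table
LET vRows      = NoOfRows('Table1');       // number of rows after the concatenation
LET vCol4Count = FieldValueCount('col4');  // distinct values in col4
LET vCol6Count = FieldValueCount('col6');  // distinct values in col6 (col4 under its alias)

// Write the numbers to the script execution log.
TRACE Fields=$(vFields) Rows=$(vRows) col4=$(vCol4Count) col6=$(vCol6Count);
```

With the posted script I would expect 7 fields rather than 5, and the same distinct count for col4 and col6, since col6 holds the same values under a new field name.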

Hope it's clear to you.

Regards,

Kaushik Solanki

diwakarnahata
Creator
Author

Hi Kaushik,

Thank you for the information.

So, does it mean that nulls take considerable space here, and QlikView is not able to compress them effectively? The data is still the same in both cases; the only difference is that 4 columns have 50% nulls after concatenation.

Also, the increase in size is huge, from 3.2 MB to 5.1 MB (~60%), when 2 out of 5 columns (40%) are renamed.

Based on the above results, can we conclude that if we concatenate multiple fact tables, the more the column names differ across tables, the poorer the performance will get?

This is critical for our application design, as we have around 4 fact tables, each of which has a few extra metrics. The biggest fact table has ~200 million rows.


Regards,

Diwakar


swuehl
MVP

I suggest that you take a look at the memory statistics for your two QVWs (the one with the concatenated fact table and the one with renamed fields).

Compare the sizes of your data tables and of the symbol tables of all fields. Note that the size of the symbol tables can make a big difference (and the number of distinct values does matter):

Symbol Tables and Bit-Stuffed Pointers

You can create the memory statistics using the settings dialog, or use Rob Wunderlich's Document Analyzer.
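
As a rough in-script complement to the memory statistics, the sketch below lists every field of the concatenated table together with its number of distinct values, which is the main driver of each symbol table's size. It assumes the table is called Table1, as in the script posted earlier in the thread; the names FieldStats, Field, DistinctValues and vField are made up for illustration. It reports counts only, so the actual byte sizes still have to come from the memory statistics.

```
// Assumes Table1 has already been loaded, as in the script posted earlier in the thread.
FOR i = 1 TO NoOfFields('Table1')
    // Name of the i-th field of Table1
    LET vField = FieldName($(i), 'Table1');

    // One row per field; later iterations concatenate automatically
    // because the field list (Field, DistinctValues) is identical.
    FieldStats:
    LOAD
        '$(vField)'                  AS Field,
        FieldValueCount('$(vField)') AS DistinctValues
    AUTOGENERATE 1;
NEXT i
```

Running this in both documents (the one with identical field names and the one with renamed fields) gives a quick side-by-side view of how many fields exist and how many distinct values each one carries.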