<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: QVD bigger than CSV in QlikView</title>
    <link>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68337#M778857</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello Jose Miguel, it's pretty normal for the QVD to use more space: it's optimized for load speed, not for saving space. Part of the extra size may come from the header and the symbol tables it also stores:&lt;/P&gt;&lt;P&gt;&lt;A href="https://help.qlik.com/en-US/sense/February2018/Subsystems/Hub/Content/Scripting/work-with-QVD-files.htm" title="https://help.qlik.com/en-US/sense/February2018/Subsystems/Hub/Content/Scripting/work-with-QVD-files.htm"&gt;https://help.qlik.com/en-US/sense/February2018/Subsystems/Hub/Content/Scripting/work-with-QVD-files.htm&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Performance tips apply more to the QVW than to the QVD.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Fri, 06 Apr 2018 11:22:21 GMT</pubDate>
    <dc:creator>rubenmarin</dc:creator>
    <dc:date>2018-04-06T11:22:21Z</dc:date>
    <item>
      <title>QVD bigger than CSV</title>
      <link>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68336#M778856</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Today I was running a test: I loaded this CSV file (&lt;A href="https://www.freemaptools.com/download/full-postcodes/ukpostcodes.zip" title="https://www.freemaptools.com/download/full-postcodes/ukpostcodes.zip" target="_blank"&gt;https://www.freemaptools.com/download/full-postcodes/ukpostcodes.zip&lt;/A&gt;) and converted it to QVD with Qlik Sense.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To my surprise, the QVD file was bigger than the CSV file.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Zip file (with CSV): 32,169 KB&lt;/P&gt;&lt;P&gt;CSV file: 94,192 KB&lt;/P&gt;&lt;P&gt;QVD file: 121,277 KB&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;That is around 30% bigger!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The file has four fields: id, latitude, longitude and postal code. I think the "problem" is in the latitude and longitude fields, which are extremely precise (15 decimal digits), because the id is an incremental integer (which takes little space in QV) and the postal code is a nine-character string (with a lot of similar blocks).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To improve performance, I have a couple of ideas:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;How many decimal digits do the coordinates of a postal code really need?&lt;/LI&gt;&lt;LI&gt;Can I split the postal code into two four-character strings? Does that make sense?&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 25 Nov 2020 16:16:04 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68336#M778856</guid>
      <dc:creator>jmvilaplanap</dc:creator>
      <dc:date>2020-11-25T16:16:04Z</dc:date>
    </item>
    <item>
      <title>Re: QVD bigger than CSV</title>
      <link>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68337#M778857</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello Jose Miguel, it's pretty normal for the QVD to use more space: it's optimized for load speed, not for saving space. Part of the extra size may come from the header and the symbol tables it also stores:&lt;/P&gt;&lt;P&gt;&lt;A href="https://help.qlik.com/en-US/sense/February2018/Subsystems/Hub/Content/Scripting/work-with-QVD-files.htm" title="https://help.qlik.com/en-US/sense/February2018/Subsystems/Hub/Content/Scripting/work-with-QVD-files.htm"&gt;https://help.qlik.com/en-US/sense/February2018/Subsystems/Hub/Content/Scripting/work-with-QVD-files.htm&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Performance tips apply more to the QVW than to the QVD.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 06 Apr 2018 11:22:21 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68337#M778857</guid>
      <dc:creator>rubenmarin</dc:creator>
      <dc:date>2018-04-06T11:22:21Z</dc:date>
    </item>
    <item>
      <title>Re: QVD bigger than CSV</title>
      <link>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68338#M778858</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I'm not surprised the QVD is larger in this case. A CSV file of locations would likely have no repeating field values; it's likely that every value is unique. If that's the case, the smallest storage format would be pure text like a CSV.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;A QVD, on the other hand, contains lists of values (symbol tables), pointers and metadata. When there are repeating field values, the "de-duplication" process will result in a QVD that takes less disk space than the source data.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;More information here:&lt;/P&gt;&lt;P&gt;&lt;A href="http://qlikviewcookbook.com/2011/03/document-compression/" title="http://qlikviewcookbook.com/2011/03/document-compression/"&gt;Document Compression | Qlikview Cookbook&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;-Rob&lt;/P&gt;&lt;P&gt;&lt;A class="jive-link-external-small" href="http://masterssummit.com" rel="nofollow" target="_blank"&gt;http://masterssummit.com&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A class="jive-link-external-small" href="http://qlikviewcookbook.com" rel="nofollow" target="_blank"&gt;http://qlikviewcookbook.com&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 06 Apr 2018 11:24:39 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68338#M778858</guid>
      <dc:creator>rwunderlich</dc:creator>
      <dc:date>2018-04-06T11:24:39Z</dc:date>
    </item>
    <item>
      <title>Re: QVD bigger than CSV</title>
      <link>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68339#M778859</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks Rob,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But in the end this space will be taken up in the QVW file (and in RAM), and this is only a small part of the data (just the coordinates for a map representation).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Reducing the precision of the coordinates was my initial idea for decreasing memory use (maybe I don't need 15 decimal digits to plot points on a map).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In the end, I think I'll do a join to discard all the unused postal codes.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks again&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 06 Apr 2018 11:34:25 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68339#M778859</guid>
      <dc:creator>jmvilaplanap</dc:creator>
      <dc:date>2018-04-06T11:34:25Z</dc:date>
    </item>
    <item>
      <title>Re: QVD bigger than CSV</title>
      <link>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68340#M778860</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P style="font-size: 13.3333px;"&gt;Thanks Ruben,&lt;/P&gt;&lt;P style="font-size: 13.3333px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333px;"&gt;But in the end this space will be taken up in the QVW file (and in RAM), and this is only a small part of the data (just the coordinates for a map representation).&lt;/P&gt;&lt;P style="font-size: 13.3333px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333px;"&gt;Reducing the precision of the coordinates was my initial idea for decreasing memory use (maybe I don't need 15 decimal digits to plot points on a map).&lt;/P&gt;&lt;P style="font-size: 13.3333px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333px;"&gt;In the end, I think I'll do a join to discard all the unused postal codes.&lt;/P&gt;&lt;P style="font-size: 13.3333px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333px;"&gt;Thanks again&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 06 Apr 2018 11:34:45 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68340#M778860</guid>
      <dc:creator>jmvilaplanap</dc:creator>
      <dc:date>2018-04-06T11:34:45Z</dc:date>
    </item>
    <item>
      <title>Re: QVD bigger than CSV</title>
      <link>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68341#M778861</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Rob is right that with mostly distinct values you won't save storage space or RAM, but with a few transformations you could reduce the size significantly, for example with:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;test:&lt;/P&gt;&lt;P&gt;// floor() turns each text value into a plain integer, and splitting the fields leaves fewer distinct values per field&lt;/P&gt;&lt;P&gt;LOAD floor(id) as id, &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; subfield(postcode, ' ', 1) as postcode1,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; subfield(postcode, ' ', 2) as postcode2,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; floor(subfield(latitude, '.', 1)) as latitude1, &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; floor(left(subfield(latitude, '.', 2), 14)) as latitude2,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; // truncating the fraction to 14 digits causes a small inaccuracy, but I think it's not essential&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; // note: floor() on the split parts also drops leading zeros of the fraction and the sign of values between -1 and 0&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; floor(subfield(longitude, '.', 1)) as longitude1, &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; floor(left(subfield(longitude, '.', 2), 14)) as longitude2&lt;/P&gt;&lt;P&gt;FROM [ukpostcodes.csv] (txt, codepage is 1252, embedded labels, delimiter is ',');&lt;/P&gt;&lt;P&gt;store test into test.qvd (qvd);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;which reduces it to 58,003 KB. If you also removed the id (is it of any use? The association with the data model should be through the postcode, right? When loading the data into a data model the complete postcode would be needed, but it could then be an autonumber() field), it would be reduced further to 45,955 KB.&lt;/P&gt;&lt;P&gt;And if you load them there with an exists() against the postcodes that actually occur, you should need even less RAM than this figure suggests.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But if you are not at the limits of your environment, it might not be worth the additional effort ...&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;- Marcus&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 06 Apr 2018 12:07:37 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/QVD-bigger-than-CSV/m-p/68341#M778861</guid>
      <dc:creator>marcus_sommer</dc:creator>
      <dc:date>2018-04-06T12:07:37Z</dc:date>
    </item>
  </channel>
</rss>

