<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Optimizing by splitting tables with null value fields? in QlikView</title>
    <link>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336326#M821902</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;John,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;you're welcome! Nice to have some appreciation on here!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Marcus&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Thu, 03 Aug 2017 08:10:04 GMT</pubDate>
    <dc:creator>marcus_malinow</dc:creator>
    <dc:date>2017-08-03T08:10:04Z</dc:date>
    <item>
      <title>Optimizing by splitting tables with null value fields?</title>
      <link>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336321#M821897</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have a large facts table, with around 100 fields and 50+ million records (along with many dimensions tables linked to it). The problem is that many of the fields within the facts table contain a large percentage of null values. In an effort to optimize the data model, I'm wondering if taking those fields containing mostly nulls and splitting them off into a separate table would help. The example below will help illustrate:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="text-decoration: underline;"&gt;Field Name&lt;/SPAN&gt;:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;SPAN style="text-decoration: underline;"&gt;Information Density:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;patientID,&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 100%&lt;/P&gt;&lt;P&gt;primarydiag,&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 100%&lt;/P&gt;&lt;P&gt;otherdiag1,&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 22%&lt;/P&gt;&lt;P&gt;otherdiag2,&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 15%&lt;/P&gt;&lt;P&gt;otherdiag3,&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 8%&lt;/P&gt;&lt;P&gt;...&lt;/P&gt;&lt;P&gt;otherdiag30,&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1%&lt;/P&gt;&lt;P&gt;[other fields]&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 90%-100%&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In this example, the fields 'patientID' and 'primarydiag' contain no null values, whereas the fields 'otherdiag1' through 'otherdiag30' contain a large portion. 
(This is hospital data: every patient admitted has a primary diagnosis, and some patients have between 1 and 30 other diagnoses).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My idea is to split off the fields otherdiag1-otherdiag30 into their own table (linked to the main facts table through the patientID field), which I'll call the "OtherDiag" table, and then reduce that table by eliminating the rows with null values in the 'otherdiag1' field (because that field contains the fewest nulls of the otherdiag1-30 fields).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My questions are: &lt;/P&gt;&lt;P&gt;1) Would this be a good way of optimizing my data model (and reducing the overall size of my app file)?&amp;nbsp; And if it is, then,&lt;/P&gt;&lt;P&gt;2) What script would I use to create the second table ("OtherDiag")? Would the script below be correct (I'm guessing it would involve the use of either "where exists" or "Where not is null")?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;???&lt;/P&gt;&lt;P&gt;OtherDiag:&lt;/P&gt;&lt;P&gt;LOAD&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; patientID,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; otherdiag1,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; otherdiag2,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ...&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; otherdiag30&lt;/P&gt;&lt;P&gt;RESIDENT FactsTable&amp;nbsp; Where Not IsNull(otherdiag1);&amp;nbsp; ????&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;...or something like this?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks in advance for any suggestions.&lt;/P&gt;&lt;P&gt;-John&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 25 Nov 2020 16:16:04 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336321#M821897</guid>
      <dc:creator>jchambers123</dc:creator>
      <dc:date>2020-11-25T16:16:04Z</dc:date>
    </item>
    <item>
      <title>Re: Optimizing by splitting tables with null value fields?</title>
      <link>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336322#M821898</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;John,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;That may be useful, though given that you're not going to reduce the overall number of values &amp;amp; pointers in each field, I doubt your data model will shrink.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Have you considered a slightly different structure:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;patientID, otherdiagNumber, otherdiag&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;for example&lt;/P&gt;&lt;P&gt;LOAD&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; patientID,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1 as otherdiagNumber,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; otherdiag1 as otherdiag&lt;/P&gt;&lt;P&gt;RESIDENT FactsTable &lt;/P&gt;&lt;P&gt;WHERE not isnull(otherdiag1);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;CONCATENATE&lt;/P&gt;&lt;P&gt;LOAD&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; patientID,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2 as otherdiagNumber,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; otherdiag2 as otherdiag&lt;/P&gt;&lt;P&gt;RESIDENT FactsTable &lt;/P&gt;&lt;P&gt;WHERE not isnull(otherdiag2);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;etc..etc...&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;with a loop this would be:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;for n = 1 to 30&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; otherdiag:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; LOAD&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; patientID,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; $(n) as otherdiagNumber,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; otherdiag$(n) as otherdiag&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; RESIDENT FactsTable &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE not isnull(otherdiag$(n));&lt;/P&gt;&lt;P&gt;next n&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In fact, if this were my data I might go further than this and include the primary diagnosis also...&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 02 Aug 2017 14:03:29 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336322#M821898</guid>
      <dc:creator>marcus_malinow</dc:creator>
      <dc:date>2017-08-02T14:03:29Z</dc:date>
    </item>
    <item>
      <title>Re: Optimizing by splitting tables with null value fields?</title>
      <link>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336323#M821899</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I asked a similar question some time ago, and the conclusion was that nulls don't use memory; only the distinct values of a field do.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Check it here: &lt;A href="https://community.qlik.com/thread/266867"&gt;How much memory uses a null value in a table?&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 02 Aug 2017 14:03:37 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336323#M821899</guid>
      <dc:creator>jmvilaplanap</dc:creator>
      <dc:date>2017-08-02T14:03:37Z</dc:date>
    </item>
    <item>
      <title>Re: Optimizing by splitting tables with null value fields?</title>
      <link>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336324#M821900</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Like Marcus and Jose, I don't believe that splitting the fact table into several tables would optimize your application much from a RAM and UI performance point of view.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But handling the otherdiag fields in the UI would be easier if you merged them into a single field by loading this part with a crosstable prefix and removing the NULLs with a where-clause in a following load. After that you could use set analysis within your expressions to calculate only those otherdiag values which you want.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;- Marcus&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 02 Aug 2017 15:41:37 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336324#M821900</guid>
      <dc:creator>marcus_sommer</dc:creator>
      <dc:date>2017-08-02T15:41:37Z</dc:date>
    </item>
    <item>
      <title>Re: Optimizing by splitting tables with null value fields?</title>
      <link>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336325#M821901</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Marcus, Jose, and Marcus,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I appreciate the three of you taking the time to reply. I was unaware that null values consume very little memory, which prompted me to do some additional reading on QlikView's data types (I found this useful article: &lt;A href="http://qlikviewcookbook.com/2008/05/memory-sizes-for-data-types/" title="http://qlikviewcookbook.com/2008/05/memory-sizes-for-data-types/"&gt;Memory sizes for data types | Qlikview Cookbook&lt;/A&gt;).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Nonetheless, Marcus' suggestion to concatenate the separate fields into a single one using a loop was a clever and elegant solution (it actually solves some other issues I had with them separated). Much appreciated, Marcus.&lt;/P&gt;&lt;P&gt;Thanks again all,&lt;/P&gt;&lt;P&gt;-John&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 02 Aug 2017 19:36:39 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336325#M821901</guid>
      <dc:creator>jchambers123</dc:creator>
      <dc:date>2017-08-02T19:36:39Z</dc:date>
    </item>
    <item>
      <title>Re: Optimizing by splitting tables with null value fields?</title>
      <link>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336326#M821902</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;John,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;you're welcome! Nice to have some appreciation on here!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Marcus&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 03 Aug 2017 08:10:04 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Optimizing-by-splitting-tables-with-null-value-fields/m-p/1336326#M821902</guid>
      <dc:creator>marcus_malinow</dc:creator>
      <dc:date>2017-08-03T08:10:04Z</dc:date>
    </item>
  </channel>
</rss>

