<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Perf problem with data discovery in QlikView</title>
    <link>https://community.qlik.com/t5/QlikView/Perf-problem-with-data-discovery/m-p/1349338#M438133</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Nicolas, &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I tried using direct discovery a few years back (on 11.2).&amp;nbsp; I found so many limitations that it was basically unusable in all but the most basic of scenarios.&amp;nbsp; It's a real shame, as it would be an exceptionally useful capability to have (had it been implemented properly).&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I can't remember the specifics (sorry, it was quite a few years ago now), but I found an enormous overhead similar to what you're experiencing in many cases.&amp;nbsp; I ran some traces at the time and I believe QV was running some "select distinct" queries on all the dimension fields on the script reload (which due to the underlying queries we had this was very inefficient for the specific use case I was prototyping).&amp;nbsp;&amp;nbsp; From memory, I think it would run a select distinct using the entire query core, but then just using a select distinct for every single field marked as a dimension.&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So if you have &lt;/P&gt;&lt;P&gt;select &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dima&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ,dimb&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ,dimc&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ,metric&lt;/P&gt;&lt;P&gt;from&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; [some really complex query]&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;qv would run 3 queries at reload time&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;select distinct dima &lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;from&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; [some really complex query]&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;select distinct dimb &lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;from&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; [some really complex query]&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;select distinct dimc &lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;from&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; [some really complex query]&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;SPAN style="font-size: 13.3333330154419px;"&gt;There was no way to tell QV to avoid the complex query and just pick up the distinct list of dimension values from another more performant table/query.&amp;nbsp; This is just one example of the types of things I found.&amp;nbsp; Not specifically related to your problem by the sound of it, but just for background.&lt;/SPAN&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;For your specific problems, I think your best bet would be to run some DB traces to understand where the issue is - you might be able to work around it depending on the specifics of your environment and use case, but expect it to be an uphill journey.&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;Regards,&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;Graeme&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Fri, 21 Jul 2017 10:35:01 GMT</pubDate>
    <dc:creator />
    <dc:date>2017-07-21T10:35:01Z</dc:date>
    <item>
      <title>Perf problem with data discovery</title>
      <link>https://community.qlik.com/t5/QlikView/Perf-problem-with-data-discovery/m-p/1349337#M438132</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have a VERY huge table stored in a Vertica instance (really fast for SELECT queries).&lt;/P&gt;&lt;P&gt;Basicly, I have the following fields:&lt;/P&gt;&lt;P&gt;- date&lt;/P&gt;&lt;P&gt;- store&lt;/P&gt;&lt;P&gt;- product&lt;/P&gt;&lt;P&gt;- price&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Where I have for each store for each date for each product, a price.&lt;/P&gt;&lt;P&gt;This table is approximatively 150 GB.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I've loaded this table in QV (12.1) with a script like &lt;/P&gt;&lt;PRE __default_attr="plain" __jive_macro_name="code" class="jive_macro_code jive_text_macro _jivemacro_uid_15006208253296321" jivemacro_uid="_15006208253296321" modifiedtitle="true"&gt;
&lt;P&gt;my_table:&lt;/P&gt;
&lt;P&gt;DIRECT QUERY&lt;/P&gt;
&lt;P&gt;DIMENSION&lt;/P&gt;
&lt;P&gt;year,&lt;/P&gt;
&lt;P&gt;date,&lt;/P&gt;
&lt;P&gt;store,&lt;/P&gt;
&lt;P&gt;product&lt;/P&gt;
&lt;P&gt;MEASURE&lt;/P&gt;
&lt;P&gt;price&lt;/P&gt;
&lt;P&gt;FROM my_table;&lt;/P&gt;
&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In QlikView, I've made a chart with dimension "Store" and "Product", and expression:&lt;/P&gt;&lt;PRE __default_attr="plain" __jive_macro_name="code" class="jive_macro_code _jivemacro_uid_15006211104245420 jive_text_macro" jivemacro_uid="_15006211104245420" modifiedtitle="true"&gt;
&lt;P&gt;avg(price)&lt;/P&gt;
&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If I filter on "year = 2017", this charts takes minutess to compute.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In another chart, with the expression:&lt;/P&gt;&lt;PRE __default_attr="plain" __jive_macro_name="code" class="jive_macro_code _jivemacro_uid_15006212834957598 jive_text_macro" jivemacro_uid="_15006212834957598" modifiedtitle="true"&gt;
&lt;P&gt;&lt;SPAN style="color: rgba(0, 0, 0, 0); font-family: Consolas, 'Courier New', Courier, mono, serif; font-size: 12px;"&gt;avg(&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="color: rgba(0, 0, 0, 0); font-family: Consolas, 'Courier New', Courier, mono, serif; font-size: 12px;"&gt;aggr(&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="color: rgba(0, 0, 0, 0); font-family: Consolas, 'Courier New', Courier, mono, serif; font-size: 12px;"&gt;if(product = 'My Favotite', avg(price))&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="color: rgba(0, 0, 0, 0); font-family: Consolas, 'Courier New', Courier, mono, serif; font-size: 12px;"&gt;, Store, Product)&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="color: rgba(0, 0, 0, 0); font-family: Consolas, 'Courier New', Courier, mono, serif; font-size: 12px;"&gt;)&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;to view only the prices of "My Favorite" product (something I would have done with a Set Analysis).&lt;/P&gt;&lt;P&gt;This chart also takes minutes (maybe 10) to compute.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In a query tool, when I submit the following query:&lt;/P&gt;&lt;PRE __default_attr="plain" __jive_macro_name="code" class="jive_macro_code jive_text_macro _jivemacro_uid_15006230132591600" jivemacro_uid="_15006230132591600" modifiedtitle="true"&gt;
&lt;P&gt;SELECT&lt;/P&gt;
&lt;P&gt;store,&lt;/P&gt;
&lt;P&gt;product,&lt;/P&gt;
&lt;P&gt;avg(price)&lt;/P&gt;
&lt;P&gt;FROM my_table&lt;/P&gt;
&lt;P&gt;WHERE year = 2017&lt;/P&gt;
&lt;P&gt;GROUP BY store, product;&lt;/P&gt;
&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;it takes 2 seconds to give me the result.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;And&lt;/P&gt;&lt;PRE __default_attr="plain" __jive_macro_name="code" class="jive_macro_code jive_text_macro _jivemacro_uid_1500624867110671" jivemacro_uid="_1500624867110671" modifiedtitle="true"&gt;
&lt;P&gt;SELECT&lt;/P&gt;
&lt;P&gt;store,&lt;/P&gt;
&lt;P&gt;product,&lt;/P&gt;
&lt;P&gt;avg(price)&lt;/P&gt;
&lt;P&gt;FROM my_table&lt;/P&gt;
&lt;P&gt;WHERE year = 2017&lt;/P&gt;
&lt;P&gt;AND product = 'My Favorite'&lt;/P&gt;
&lt;P&gt;GROUP BY store, product;&lt;/P&gt;
&lt;/PRE&gt;&lt;P&gt;less than 1 second.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I would like to understand why there is such a difference.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I didn't find where to set the parameter that creates a SQL.log and view the exact query.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 21 Jul 2017 08:24:20 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Perf-problem-with-data-discovery/m-p/1349337#M438132</guid>
      <dc:creator>nicolas_martin</dc:creator>
      <dc:date>2017-07-21T08:24:20Z</dc:date>
    </item>
    <item>
      <title>Re: Perf problem with data discovery</title>
      <link>https://community.qlik.com/t5/QlikView/Perf-problem-with-data-discovery/m-p/1349338#M438133</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Nicolas, &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I tried using direct discovery a few years back (on 11.2).&amp;nbsp; I found so many limitations that it was basically unusable in all but the most basic of scenarios.&amp;nbsp; It's a real shame, as it would be an exceptionally useful capability to have (had it been implemented properly).&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I can't remember the specifics (sorry, it was quite a few years ago now), but I found an enormous overhead similar to what you're experiencing in many cases.&amp;nbsp; I ran some traces at the time and I believe QV was running some "select distinct" queries on all the dimension fields on the script reload (which due to the underlying queries we had this was very inefficient for the specific use case I was prototyping).&amp;nbsp;&amp;nbsp; From memory, I think it would run a select distinct using the entire query core, but then just using a select distinct for every single field marked as a dimension.&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So if you have &lt;/P&gt;&lt;P&gt;select &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dima&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ,dimb&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ,dimc&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ,metric&lt;/P&gt;&lt;P&gt;from&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; [some really complex query]&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;qv would run 3 queries at reload time&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;select distinct dima &lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;from&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; [some really complex query]&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;select distinct dimb &lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;from&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; [some really complex query]&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;select distinct dimc &lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;from&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; [some really complex query]&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;SPAN style="font-size: 13.3333330154419px;"&gt;There was no way to tell QV to avoid the complex query and just pick up the distinct list of dimension values from another more performant table/query.&amp;nbsp; This is just one example of the types of things I found.&amp;nbsp; Not specifically related to your problem by the sound of it, but just for background.&lt;/SPAN&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;For your specific problems, I think your best bet would be to run some DB traces to understand where the issue is - you might be able to work around it depending on the specifics of your environment and use case, but expect it to be an uphill journey.&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;Regards,&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333330154419px;"&gt;Graeme&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 21 Jul 2017 10:35:01 GMT</pubDate>
      <guid>https://community.qlik.com/t5/QlikView/Perf-problem-with-data-discovery/m-p/1349338#M438133</guid>
      <dc:creator />
      <dc:date>2017-07-21T10:35:01Z</dc:date>
    </item>
  </channel>
</rss>

