<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Best practices for performance? in App Development</title>
    <link>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1592340#M43524</link>
    <description>&lt;P&gt;"&lt;SPAN&gt;&amp;nbsp;I believe this means that in my case, those two formulas should always be equivalent, no?"&amp;nbsp; Yes, I would agree.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;"Actually, at this point, I'm leaning toward making a new field in the load script instead since I don't see why everything can't be precalculated except for the aggregation"&amp;nbsp; I think this is the best approach for performance and consistency.&amp;nbsp; I would also consider rounding the new values to the same precision as your accounting system, usually .01.&amp;nbsp; That will reduce the overall number of values that Qlik has to store and yield consistent results with any slice of data.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;-Rob&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Sun, 16 Jun 2019 22:07:46 GMT</pubDate>
    <dc:creator>rwunderlich</dc:creator>
    <dc:date>2019-06-16T22:07:46Z</dc:date>
    <item>
      <title>Best practices for performance?</title>
      <link>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1591865#M43484</link>
      <description>&lt;P&gt;I am wondering which scenario has the better performance in each of the pairs below.&amp;nbsp; The metric I am most interested in is visualization rendering time, both initially and in response to user selections.&amp;nbsp; I'm less concerned with memory consumption or load-script duration.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Measure = Sum(FieldA) * 1000  // 1. Aggregate first
Measure = Sum(FieldA * 1000)  // 2. Aggregate last

Measure = Sum(FieldA + FieldB)       // 1. Aggregate last
Measure = Sum(FieldA) + Sum(FieldB)  // 2. Aggregate first

NewField = FieldA * 1000  // 1. Make another field while loading data
Variable = FieldA * 1000  // 2. Make a variable instead using "Set var = ...;"

NewField = FieldA + FieldB  // 1. Make another field while loading data
Variable = FieldA + FieldB  // 2. Make a variable instead using "Set var = ...;"&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In each case, I put the option that I think should result in more responsive visualizations first, but I don't have the empirical data or knowledge of what's under the hood to back it up.&amp;nbsp; I don't know 1) how the data is physically arranged in memory, 2) whether Qlik executes formulas as is or rearranges them for efficiency like SQL query optimizers, or 3) whether things like&amp;nbsp;caching or preloading come into play.&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jun 2019 05:19:01 GMT</pubDate>
      <guid>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1591865#M43484</guid>
      <dc:creator>mmarchese</dc:creator>
      <dc:date>2019-06-14T05:19:01Z</dc:date>
    </item>
    <item>
      <title>Re: Best practices for performance?</title>
      <link>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1591874#M43485</link>
      <description>&lt;P&gt;Only the first pair produce the same result. I think it would be very hard to detect any difference in render time for these two, even for large datasets, although i would expect (1) to be a marginally faster.&lt;/P&gt;&lt;P&gt;The second pair will produce different results if there are any null values in one field and non-null values in the other, as adding anything with a null produces a null.&lt;/P&gt;&lt;P&gt;I don't understand how you would like to use the variables in the second two examples. Loading a field (a set of values) and a variable (a scalar)&amp;nbsp; are certainly not equivalent.&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jun 2019 05:38:43 GMT</pubDate>
      <guid>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1591874#M43485</guid>
      <dc:creator>jonathandienst</dc:creator>
      <dc:date>2019-06-14T05:38:43Z</dc:date>
    </item>
    <item>
      <title>Re: Best practices for performance?</title>
      <link>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1591887#M43488</link>
      <description>&lt;P&gt;What is your conclusion that they'd have similar performance based on?&amp;nbsp; I would think that given many rows of data, the choice is between [adding millions of times] and [multiplying once] versus [adding millions of times] and [multiplying millions of times].&amp;nbsp; Shouldn't the latter be at least twice as slow since it's twice as many operations?&amp;nbsp; Or is there something I am not considering?&lt;/P&gt;&lt;P&gt;Assume all pairs produce the same result since the question would be pointless otherwise (so disregard situations with nulls).&lt;/P&gt;&lt;P&gt;Sorry if my terminology was not clear.&amp;nbsp; When I say a field, I mean a column that comes right from a table or a calculated column that is derived directly from tables, rather than requiring aggregation.&amp;nbsp; When I say "a variable made with Set," I'm talking about a text macro, not a scalar.&amp;nbsp; The two options I was getting at are 1) adding two columns together at load time to create a third column and then referencing that new column in your dimensions and measures, or 2) creating a text-macro variable that consists of a formula adding those two columns together and then referencing that variable in your dimensions and measures instead.&amp;nbsp; Intuitively, option 1 strikes me as much faster (since the work has already been done) at the cost of more memory.&amp;nbsp; Here's a more detailed example:&lt;/P&gt;&lt;P&gt;Option 1:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;// load script
LOAD
  FieldA,
  FieldB,
  FieldC = FieldA + FieldB
...

// worksheet
Dimension = FieldC
Measure = Sum(FieldC)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Option 2:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;// load script
LOAD
  FieldA,
  FieldB
...

Set VariableC = FieldA + FieldB;

// worksheet
Dimension = $(VariableC)
Measure = Sum($(VariableC))&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jun 2019 06:38:13 GMT</pubDate>
      <guid>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1591887#M43488</guid>
      <dc:creator>mmarchese</dc:creator>
      <dc:date>2019-06-14T06:38:13Z</dc:date>
    </item>
    <item>
      <title>Re: Best practices for performance?</title>
      <link>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1592299#M43520</link>
      <description>&lt;P&gt;Let's deal with the 3rd &amp;amp; 4th examples first.&amp;nbsp; Precalculating in the script, when possible, will always yield faster response time in the front end. Yes, there will be additional memory in the data model but it's likely to be trivial.&amp;nbsp; So creating a new field in the script is almost always the better performance choice.&amp;nbsp;&lt;/P&gt;&lt;P&gt;With regards to the question of multiplying by the total sum vs multiplying each individual value. Theoretically there are fewer instructions executed when doing a single multiplication by the entire sum().&amp;nbsp;&lt;/P&gt;&lt;P&gt;"&lt;SPAN&gt;Shouldn't the latter be at least twice as slow since it's twice as many operations?&amp;nbsp; Or is there something I am not considering?" There is a lot more going on (such as marshalling the data) in calculating a measure beyond the specific calc we see, so "twice as much" only applies to a portion of the overall calc time. I ran some timing tests on your examples with 7M rows and I agree with&amp;nbsp;&lt;a href="https://community.qlik.com/t5/user/viewprofilepage/user-id/14931"&gt;@jonathandienst&lt;/a&gt;&amp;nbsp;that the differences are trivial -- at most ~5% better.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Importantly as&amp;nbsp;&lt;a href="https://community.qlik.com/t5/user/viewprofilepage/user-id/14931"&gt;@jonathandienst&lt;/a&gt;&amp;nbsp;pointed out, the difference expressions may return different results.&amp;nbsp; We can't always ignore that.&amp;nbsp; For large numbers,&amp;nbsp; &amp;nbsp;"Sum(x * 1000)" and "Sum(x) * 1000" will return slightly different results due to rounding . So what's the business intent? Sum up the value and express the result as a factor of 1000 or factor each value by 1000 and sum the total?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;The Qlik engine is very sophisticated&amp;nbsp;in it's optimization. It's&amp;nbsp;&lt;EM&gt;possible&lt;/EM&gt; that "Sum(x) + Sum(y)"would calculate faster than than "Sum(x + y)" because each Sum could be calculated in a separate&amp;nbsp;CPU thread.&amp;nbsp; Qlik is also very good at distributing data rows of a sum across multiple threads, so that may be just as good.&amp;nbsp; I can't observe a significant difference between the two options.&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Critically important though&amp;nbsp;is that&amp;nbsp;"Sum(x) + Sum(y)" and&amp;nbsp;"Sum(x + y)" return a different result (front end )&amp;nbsp;&lt;EM&gt;if x and y are in different tables.&amp;nbsp;&lt;/EM&gt; So you should&amp;nbsp;always first be driven by getting a correct and consistent business result, performance secondary as required.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;All that said, performance in Qlik is always a fascinating topic. My colleague&amp;nbsp;&lt;a href="https://community.qlik.com/t5/user/viewprofilepage/user-id/6152"&gt;@Oleg_Troyansky&lt;/a&gt;&amp;nbsp;leads a three hour Performance session at the &lt;A href="https://masterssummit.com/" target="_self"&gt;Masters Summit for Qlik&lt;/A&gt; if you want to geek out on Qlik performance details&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;-Rob&lt;BR /&gt;&lt;A href="http://masterssummit.com" target="_blank"&gt;http://masterssummit.com&lt;/A&gt;&lt;BR /&gt;&lt;A href="http://qlikviewcookbook.com" target="_blank"&gt;http://qlikviewcookbook.com&lt;/A&gt;&lt;BR /&gt;&lt;A href="http://www.easyqlik.com" target="_blank"&gt;http://www.easyqlik.com&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 16 Jun 2019 05:23:28 GMT</pubDate>
      <guid>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1592299#M43520</guid>
      <dc:creator>rwunderlich</dc:creator>
      <dc:date>2019-06-16T05:23:28Z</dc:date>
    </item>
    <item>
      <title>Re: Best practices for performance?</title>
      <link>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1592332#M43522</link>
      <description>&lt;P&gt;Thanks for the detailed reply, which was very helpful.&amp;nbsp; With that many rows, I was not expecting aggregating before multiplying to be only 5% faster than multiplying before aggregating - interesting.&lt;/P&gt;&lt;P&gt;I have some follow-up questions if you have time:&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;P&gt;&lt;SPAN&gt;Critically important though&amp;nbsp;is that&amp;nbsp;"Sum(x) + Sum(y)" and&amp;nbsp;"Sum(x + y)" return a different result (front end )&amp;nbsp;&lt;EM&gt;if x and y are in different tables.&lt;/EM&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;My load script creates one big Qlik table from numerous database tables, where each Qlik row represents a sale.&amp;nbsp; It does this importing the first database table, left joining each subsequent table to it, and then doing a resident load to replace any nulls with appropriate defaults such as 0 for numeric columns and 'None' for text columns.&amp;nbsp; I believe this means that in my case, those two formulas should always be equivalent, no?&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;P&gt;&lt;SPAN&gt;For large numbers,&amp;nbsp; "Sum(x * 1000)" and "Sum(x) * 1000" will return slightly different results due to rounding . So what's the business intent? Sum up the value and express the result as a factor of 1000 or factor each value by 1000 and sum the total?&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;I have two table columns: [cost per thousand pieces] and [pieces].&amp;nbsp; To get the total combined cost for all selected sales, I have to do [cost per thousand pieces] * [pieces] / 1000 and aggregate via Sum().&amp;nbsp; Therefore, I must choose between these options:&lt;/SPAN&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;measure = Sum([cost per thousand pieces] * [pieces]) / 1000
measure = Sum([cost per thousand pieces] * [pieces] / 1000)
measure = Sum([cost per thousand pieces] / 1000 * [pieces])&lt;/LI-CODE&gt;&lt;P&gt;&lt;SPAN&gt;I don't really know which would have the least rounding error or which would express the business intent most directly.&amp;nbsp; I suspect it doesn't really matter since I don't really need high precision here, but do you have any comments?&amp;nbsp; Actually, at this point, I'm leaning toward making a new field in the load script instead since I don't see why everything can't be precalculated except for the aggregation:&lt;/SPAN&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[new cost field] = [cost per thousand pieces] * [pieces] / 1000
measure = Sum([new cost field])&lt;/LI-CODE&gt;&lt;P&gt;&lt;SPAN&gt;I also need to display&amp;nbsp;[cost per thousand pieces] and [pieces] themselves, so I plan to retain those original fields.&amp;nbsp; And whichever method I pick will be duplicated a bunch of times since I have to go through this with a dozen different cost columns.&amp;nbsp; I could also take the middle ground by doing some of the non-aggregating math in the load script and some of it the fly, like this:&lt;/SPAN&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[new thousands of pieces field] = [pieces] / 1000
measure = Sum([cost per thousand pieces] * [new thousands of pieces field])&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 16 Jun 2019 17:58:41 GMT</pubDate>
      <guid>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1592332#M43522</guid>
      <dc:creator>mmarchese</dc:creator>
      <dc:date>2019-06-16T17:58:41Z</dc:date>
    </item>
    <item>
      <title>Re: Best practices for performance?</title>
      <link>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1592340#M43524</link>
      <description>&lt;P&gt;"&lt;SPAN&gt;&amp;nbsp;I believe this means that in my case, those two formulas should always be equivalent, no?"&amp;nbsp; Yes, I would agree.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;"Actually, at this point, I'm leaning toward making a new field in the load script instead since I don't see why everything can't be precalculated except for the aggregation"&amp;nbsp; I think this is the best approach for performance and consistency.&amp;nbsp; I would also consider rounding the new values to the same precision as your accounting system, usually .01.&amp;nbsp; That will reduce the overall number of values that Qlik has to store and yield consistent results with any slice of data.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;-Rob&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 16 Jun 2019 22:07:46 GMT</pubDate>
      <guid>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1592340#M43524</guid>
      <dc:creator>rwunderlich</dc:creator>
      <dc:date>2019-06-16T22:07:46Z</dc:date>
    </item>
    <item>
      <title>Re: Best practices for performance?</title>
      <link>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1592344#M43526</link>
      <description>Ok, thanks!&lt;BR /&gt;&lt;BR /&gt;For others who happen to read this thread, the reasoning behind Rob's tip to round currency to reduce the number of distinct values that Qlik has to store and thus increase performance is explained here:&lt;BR /&gt;&lt;A href="https://community.qlik.com/t5/Qlik-Design-Blog/Symbol-Tables-and-Bit-Stuffed-Pointers/ba-p/1475369" target="_blank"&gt;https://community.qlik.com/t5/Qlik-Design-Blog/Symbol-Tables-and-Bit-Stuffed-Pointers/ba-p/1475369&lt;/A&gt;</description>
      <pubDate>Mon, 17 Jun 2019 01:25:37 GMT</pubDate>
      <guid>https://community.qlik.com/t5/App-Development/Best-practices-for-performance/m-p/1592344#M43526</guid>
      <dc:creator>mmarchese</dc:creator>
      <dc:date>2019-06-17T01:25:37Z</dc:date>
    </item>
  </channel>
</rss>

