<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Improve performance when do a lookup in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261342#M42150</link>
    <description>&lt;P&gt;First of all, are you getting the data from the same database? If so, you should consider joining and filtering in the DB. There is little point in bringing in more data than you need only to throw it away.&lt;/P&gt;</description>
    <pubDate>Tue, 09 May 2017 12:54:17 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2017-05-09T12:54:17Z</dc:date>
    <item>
      <title>Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261341#M42149</link>
      <description>&lt;P&gt;Hi team,&lt;/P&gt; 
&lt;P&gt;When I run a subjob, the main table has more than 200,000 records. I used "Lookup Model: Reload at each row" to filter the&amp;nbsp;lookup table in tMap, and I also added conditions on the main table. However, the whole job still takes about 20 minutes.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 897px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lu6k.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/153304iFC804F8EDC2EA57C/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lu6k.png" alt="0683p000009Lu6k.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;I applied some filters through globalMap in the lookup table.&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 572px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lu6y.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/131210iE51CA18B820CE34C/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lu6y.png" alt="0683p000009Lu6y.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;In tMap, I use the "Left Outer Join" join model and the "Reload at each row" lookup model.&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 304px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Ltmg.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/144542iD43C9C4D898AA3BC/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Ltmg.png" alt="0683p000009Ltmg.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Running the whole job takes a long time (about 20 minutes).&lt;/P&gt; 
&lt;P&gt;Do you have any ideas for performance tuning?&lt;/P&gt; 
&lt;P&gt;By the way, is there a good way to use a connection pool in Talend?&lt;/P&gt; 
&lt;P&gt;Thank you in advance!&lt;/P&gt;</description>
      <pubDate>Tue, 09 May 2017 11:51:16 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261341#M42149</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-09T11:51:16Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261342#M42150</link>
      <description>&lt;P&gt;First of all, are you getting the data from the same database? If so, you should consider joining and filtering in the DB. There is little point in bringing in more data than you need only to throw it away.&lt;/P&gt;</description>
      <pubDate>Tue, 09 May 2017 12:54:17 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261342#M42150</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-09T12:54:17Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261343#M42151</link>
      <description>Hi,
&lt;BR /&gt;Did you measure the response time of these queries outside of Talend?
&lt;BR /&gt;Also (this will not solve the problem by itself), you should remove from the SELECT list every field whose value you already know because it appears in the WHERE clause. You'll have less data to transfer from the DB server.</description>
      <pubDate>Tue, 09 May 2017 23:25:23 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261343#M42151</guid>
      <dc:creator>TRF</dc:creator>
      <dc:date>2017-05-09T23:25:23Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261344#M42152</link>
      <description>&lt;P&gt;The data comes from different databases.&lt;/P&gt;&lt;P&gt;I think I have already added enough conditions to filter in the DB, and I also do some processing in the tMap, so I have no idea how else to improve the performance!&lt;/P&gt;</description>
      <pubDate>Wed, 10 May 2017 09:45:45 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261344#M42152</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-10T09:45:45Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261345#M42153</link>
      <description>The data comes from different databases.&lt;BR /&gt;I think I have already added enough conditions to filter in the DB, and I also do some processing in the tMap, so I have no idea how else to improve the performance!</description>
      <pubDate>Wed, 10 May 2017 09:46:20 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261345#M42153</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-10T09:46:20Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261346#M42154</link>
      <description>&lt;P&gt;OK, it looks like you are doing the sort of things I would try to speed things up. Another thing to test is where the bottleneck is. Can you disconnect your DB write/update components and just run the part where the join is happening? Does that massively speed things up? If so (as I have seen before), this could very well be an insert/update issue. Also, are you inserting or updating, or both?&lt;/P&gt;</description>
      <pubDate>Wed, 10 May 2017 09:50:38 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261346#M42154</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-10T09:50:38Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261347#M42155</link>
      <description>&lt;P&gt;I disconnected the update/insert components and tested only the join. I think the bottleneck is in the join.&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 914px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lu81.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/134197i264B31A713B479EC/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lu81.png" alt="0683p000009Lu81.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 10 May 2017 10:18:03 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261347#M42155</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-10T10:18:03Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261348#M42156</link>
      <description>&lt;P&gt;OK. One place where you might get a slight improvement is by removing the joins on the columns you are already filtering your lookup on. They are not needed: the lookup query only returns rows that hold the same values as the row coming in on your main flow. Therefore you can remove those columns from the query and from the join. This is unlikely to resolve the issue on its own, but it will be more efficient.&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;However, before doing that, are you sure that reloading the lookup at every row is the most efficient way of doing this? Sometimes it is, without question, but if your main flow is 200,000 rows, that is 200,000 queries fired at the DB. I am assuming you have probably looked into this, but it is worth considering if you haven't, and it is easier to test than removing the joins as I advised above.&lt;/P&gt;</description>
      <pubDate>Wed, 10 May 2017 10:28:40 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261348#M42156</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-10T10:28:40Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261349#M42157</link>
      <description>&lt;P&gt;I tried removing the joins, but it does not seem to be any more efficient.&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 444px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Ltzp.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/153685iF569AA6B56B1A8CC/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Ltzp.png" alt="0683p000009Ltzp.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 299px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LtnZ.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/139773i949E2589A9C21B91/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LtnZ.png" alt="0683p000009LtnZ.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;And I think I have to reload the lookup at every row because the lookup table has&amp;nbsp;428,634,831 rows. Otherwise it would be even slower.&lt;/P&gt;</description>
      <pubDate>Wed, 10 May 2017 11:13:47 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261349#M42157</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-10T11:13:47Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261350#M42158</link>
      <description>&lt;P&gt;OK, do you know how many rows are returned each time the lookup is fired? In theory, if you are going to fire 200,000 queries, each one should return only the rows that you need. If you are returning more rows and having to filter them in Java, maybe you can move that filtering into your query?&lt;BR /&gt;&lt;BR /&gt;Another solution will require some playing around, but may just work. I suspect that you can preemptively find some values to filter your lookup query on, and then carry out your lookup only once. To do this you will need to change the job to dump your main flow into a tHash component (store it in memory). You have some columns that you need to join on or filter by. Now, if you load the main flow first, you *may* be able to find some data to filter the lookup by. For example, let's say that your lookup data has a column called "alphabet" which holds every letter from a to z. However, your main flow only returns "a", "g" and "y". If you know this before running the lookup query, you can add an "IN" filter to the query and pass in your comma-separated list of "a", "g" and "y". This might remove 3/4 of your lookup data. If you load the lookup only once and into memory, you lose the latency of firing the query for every row, and the number of rows to be checked each time is also massively reduced.&amp;nbsp;&lt;/P&gt; 
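&lt;P&gt;To make the "IN" filter idea concrete, here is a minimal sketch. It is written in Python with an in-memory SQLite database purely so that it runs on its own; it is not Talend code. The table name lookup_data, the label column and the sample rows are all made up for illustration; only the "alphabet" column and the values "a", "g" and "y" come from the example above.&lt;/P&gt;
&lt;PRE&gt;
```python
import sqlite3

# Hypothetical data: the lookup table holds every letter a..z,
# while the main flow only contains "a", "g" and "y".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lookup_data (alphabet TEXT PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO lookup_data VALUES (?, ?)",
    [(chr(code), "label_" + chr(code)) for code in range(ord("a"), ord("z") + 1)],
)

# Stand-in for the main flow held in memory (the tHash step in the job).
main_flow = [
    {"id": 1, "alphabet": "a"},
    {"id": 2, "alphabet": "g"},
    {"id": 3, "alphabet": "y"},
    {"id": 4, "alphabet": "a"},
]

# Step 1: scan the main flow and collect the distinct filter values.
keys = sorted({row["alphabet"] for row in main_flow})

# Step 2: build one IN filter with a placeholder per value and query once.
placeholders = ", ".join("?" for _ in keys)
sql = "SELECT alphabet, label FROM lookup_data WHERE alphabet IN (" + placeholders + ")"
lookup = dict(conn.execute(sql, keys).fetchall())

# Step 3: join in memory; a missing key yields None, like a left outer join.
joined = [dict(row, label=lookup.get(row["alphabet"])) for row in main_flow]

print(len(lookup), "lookup rows loaded for", len(main_flow), "main rows")
```
&lt;/PRE&gt;
&lt;P&gt;One query replaces one query per main row. If the main flow yields a very large number of distinct values, the IN list would probably have to be split into batches, since many databases limit the number of parameters in a single statement.&lt;/P&gt;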
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Another thing I have just noticed: do you need to match ALL ROWS in the match model? If you don't, then switch this off. Leaving it on means that all of the data has to be checked for every incoming row.&lt;/P&gt;</description>
      <pubDate>Wed, 10 May 2017 11:34:14 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261350#M42158</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-10T11:34:14Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261351#M42159</link>
      <description>&lt;P&gt;Only one row is returned for every row that the lookup is fired for.&lt;/P&gt; 
&lt;P&gt;You are right:&amp;nbsp;the best practice is to find some values to filter my lookup query on and load the lookup once, which is more efficient.&lt;/P&gt; 
&lt;P&gt;However, if I&amp;nbsp;cannot find any values to filter my lookup query on, can I&amp;nbsp;use "Parallelization" to execute the job, and then use a connection pool to manage the tMSSqlConnection? What do you think? Do you have any other ideas?&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 956px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lu8p.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/157416i8B161DFDBD499627/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lu8p.png" alt="0683p000009Lu8p.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 11 May 2017 07:27:45 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261351#M42159</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-11T07:27:45Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261352#M42160</link>
      <description>&lt;P&gt;Using p&lt;SPAN&gt;arallelization may help in this scenario, but it could also slow things down. A rule of thumb for the number of threads is the number of cores minus 1. If you start with this and tweak (up and down), you will find an optimum. This also depends on how much other work your machine is doing.&lt;BR /&gt;&lt;BR /&gt;Connection pooling will not help you here. The connection component will handle connections for every DB component in your job, provided they are hooked up to the connection component. Connection pooling would only really be of use in the situation where the job is running multiple times in parallel. &amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 11 May 2017 09:12:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261352#M42160</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-11T09:12:10Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261353#M42161</link>
      <description>&lt;P&gt;Yeah, we'd like to run&amp;nbsp;&lt;SPAN&gt;the job&amp;nbsp;multiple times in parallel. &amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;However, it seems that connection pooling cannot be used in TBD 6.2.1, right?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 12 May 2017 10:35:21 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261353#M42161</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-12T10:35:21Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261354#M42162</link>
      <description>&lt;P&gt;I don't understand how connection pooling would help you. The Connection component maintains the connections for the job. If you have jobs running in parallel (i.e. multiple instances of a job, NOT a section of the job as demonstrated in your screenshots), the Connection component in each instance will handle this. Connection pooling is only really a requirement for a web service, where thousands of instances of a service can be using connections concurrently. This is handled quite easily using Spring and has been supported for quite a while now.&lt;/P&gt;</description>
      <pubDate>Fri, 12 May 2017 10:47:23 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261354#M42162</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-12T10:47:23Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261355#M42163</link>
      <description>&lt;P&gt;We have now resolved the performance issue by adding more filters to the lookup table, which was the simplest option. After I modified it, the job sped up by 50%. It's much better than before.&lt;/P&gt; 
&lt;P&gt;Many thanks for your help &lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MACJ.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/133049iD780B7DE0116E4D1/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MACJ.png" alt="0683p000009MACJ.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 16 May 2017 03:58:04 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261355#M42163</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-16T03:58:04Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261356#M42164</link>
      <description>&lt;P&gt;Hello Icy,&lt;/P&gt;
&lt;P&gt;Thanks for your feedback and posting your solution here.&lt;/P&gt;
&lt;P&gt;Best regards&lt;/P&gt;
&lt;P&gt;Sabrina&lt;/P&gt;</description>
      <pubDate>Tue, 16 May 2017 04:01:16 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261356#M42164</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-16T04:01:16Z</dc:date>
    </item>
    <item>
      <title>Re: Improve performance when do a lookup</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261357#M42165</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;In your lookup, the database server needs to calculate the execution plan each time the query is fired, because of the "Reload at each row" property in tMap.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;The best way to do a lookup like this is to use a parameterized query, so that the database server calculates the execution plan for the query once, and not every time the query is fired. To do this, you'll need to replace the&amp;nbsp;LKPFulfilPackageCharge lookup table with a &lt;STRONG&gt;tOracleRow&lt;/STRONG&gt; and a &lt;STRONG&gt;tParseRecordSet&lt;/STRONG&gt;.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;In the tOracleRow, don't concatenate the filter values into the query; use a parameterized query such as &lt;FONT face="courier new,courier"&gt;"select columnA, columnB from Table01 where columnC = &lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MAB6.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/158321i00588DF41617C922/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MAB6.png" alt="0683p000009MAB6.png" /&gt;&lt;/span&gt;ColC"&lt;/FONT&gt;.&lt;/P&gt; 
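&lt;P&gt;As a rough illustration of the difference, here is a minimal sketch of a parameterized lookup. It uses Python with an in-memory SQLite database only so that it can run on its own; it is not Talend or Oracle code. Table01, columnA, columnB and columnC are the names from the query above, while the sample rows, the lookup function and the "?" placeholder (SQLite's syntax, which differs between drivers) are assumptions.&lt;/P&gt;
&lt;PRE&gt;
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table01 (columnA TEXT, columnB TEXT, columnC INTEGER)")
conn.executemany(
    "INSERT INTO Table01 VALUES (?, ?, ?)",
    [("a1", "b1", 1), ("a2", "b2", 2), ("a3", "b3", 3)],
)

# The SQL text is constant; only the bound value changes from row to row,
# so the statement does not have to be parsed afresh for every main row.
QUERY = "SELECT columnA, columnB FROM Table01 WHERE columnC = ?"


def lookup(col_c):
    """Return the (columnA, columnB) row for one key, or None if absent."""
    return conn.execute(QUERY, (col_c,)).fetchone()


# Stand-in for the key values arriving on the main flow.
main_flow = [1, 3, 4]
results = [lookup(key) for key in main_flow]
print(results)
```
&lt;/PRE&gt;
&lt;P&gt;Concatenating the value into the string instead would produce a different SQL text for every row, which is what forces the server to plan each query separately. In a Talend job the counterpart would be a JDBC prepared statement on the Row component; how that is configured is not shown here.&lt;/P&gt;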
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;That should improve your Job performance considerably.&lt;/P&gt; 
&lt;P&gt;Best regards.&lt;/P&gt;</description>
      <pubDate>Wed, 05 Jul 2017 12:55:57 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Improve-performance-when-do-a-lookup/m-p/2261357#M42165</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-05T12:55:57Z</dc:date>
    </item>
  </channel>
</rss>

