<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Performance Issue in Talend Job in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328100#M97409</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regarding the point below:&lt;/P&gt;
&lt;P&gt;2) It depends on your design.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am using the Load Once option in tMap for the lookup model. How should I make sure that the data is reloaded freshly from the lookup when a new batch starts?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards,&lt;/P&gt;
&lt;P&gt;Romi&lt;/P&gt;</description>
    <pubDate>Mon, 17 Jul 2017 15:06:47 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2017-07-17T15:06:47Z</dc:date>
    <item>
      <title>Performance Issue in Talend Job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328093#M97402</link>
      <description>&lt;P&gt;Hi Team,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I am reading a file with 10 million records in Talend and applying transformations based on joins against certain lookup files.&lt;/P&gt; 
&lt;P&gt;The lookup file datasets are also in the millions of records. This is causing a huge performance bottleneck while running the job.&lt;/P&gt; 
&lt;P&gt;For performance enhancement:&lt;/P&gt; 
&lt;P&gt;a) I haven't used any sorting component.&lt;/P&gt; 
&lt;P&gt;b) I am using the temp directory option for tMap.&lt;/P&gt; 
&lt;P&gt;c) For large lookup files, I am only reading the columns that are required for the lookup.&lt;/P&gt; 
&lt;P&gt;d) I increased the JVM size to 32 GB.&lt;/P&gt; 
&lt;P&gt;e) I have also tried the parallelization components (tPartitioner, tCollector, etc.).&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;After trying all of the above steps, the job performance has still not improved; it is failing with an Out Of Memory error.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Version: Talend 5.2.1&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I have also tried the above steps in Talend Open Studio 6.1.1, without the partitioning components, as they are not available there.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Please advise what more can be done to improve the performance.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jul 2017 12:22:40 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328093#M97402</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-13T12:22:40Z</dc:date>
    </item>
    <item>
      <title>Re: Performance Issue in Talend Job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328094#M97403</link>
      <description>&lt;P&gt;Can you give us a screenshot of your job and a screenshot of your tMap config please?&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jul 2017 14:13:26 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328094#M97403</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-13T14:13:26Z</dc:date>
    </item>
    <item>
      <title>Re: Performance Issue in Talend Job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328095#M97404</link>
      <description>&lt;P&gt;Make sure you limit the use of the BigDecimal data type in your schemas; avoid it if you don't need it. &amp;nbsp;On Java 8, the long data type on 64-bit systems is wide enough that you only need BigDecimal for monetary calculations.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;See&amp;nbsp;&lt;A href="https://help.talend.com/reader/LaJCcBd9KN6OIRvxbF9vrw/znUHK9naxHBQdZslcjneZQ" target="_blank" rel="nofollow noopener noreferrer"&gt;https://help.talend.com/reader/LaJCcBd9KN6OIRvxbF9vrw/znUHK9naxHBQdZslcjneZQ&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jul 2017 17:03:36 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328095#M97404</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-13T17:03:36Z</dc:date>
    </item>
    <item>
      <title>Re: Performance Issue in Talend Job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328096#M97405</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have looked at the link you gave, but there is one problem: if I have lookups attached to the flow, those lookups are loaded again and again for every batch run.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;How can this be avoided, so that the lookups are loaded only once?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards,&lt;/P&gt;
&lt;P&gt;Romi&lt;/P&gt;</description>
      <pubDate>Mon, 17 Jul 2017 13:05:16 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328096#M97405</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-17T13:05:16Z</dc:date>
    </item>
    <item>
      <title>Re: Performance Issue in Talend Job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328097#M97406</link>
      <description>&lt;P&gt;If it is a small set of data, you can cache the data in a tHashOutput component and use a tHashInput for the lookup. &amp;nbsp;That way, the lookup happens in memory.&lt;/P&gt; 
&lt;P&gt;If you are looking up a huge volume of data that depends on the input, it is better to reload at each batch, to avoid loading a huge dataset and causing your job to run out of memory.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;The volume of data you are looking up will determine how much RAM your job will need.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;You may need to experiment with various designs to figure out what works best for you.&lt;/P&gt;</description>
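The in-memory caching approach described in the reply above can be sketched outside Talend too. This is a minimal Python sketch, assuming a hypothetical two-column CSV lookup file (the file layout and field names are made up); it mirrors what tHashOutput/tHashInput achieve inside a job: load the small lookup into memory once, then probe it per main-flow row.

```python
import csv

def load_lookup(path):
    """Read key,value pairs from a two-column CSV file into a dict (one pass)."""
    lookup = {}
    with open(path, newline="") as f:
        for key, value in csv.reader(f):
            lookup[key] = value
    return lookup

def enrich(rows, lookup):
    """Join each main-flow row against the cached lookup by its first field."""
    for row in rows:
        # Missing keys fall back to an empty string, like a left outer join.
        yield row + [lookup.get(row[0], "")]
```

With millions of lookup rows, a dict like this is exactly what fills the heap, which is why the trade-off above between caching once and reloading per batch matters.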
      <pubDate>Mon, 17 Jul 2017 13:09:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328097#M97406</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-17T13:09:10Z</dc:date>
    </item>
    <item>
      <title>Re: Performance Issue in Talend Job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328098#M97407</link>
      <description>&lt;P&gt;I have a source file with 2 million records and 5 lookup files of the same size (2 million records each). I am processing in batches of 0.5 million, i.e. 4 batches will run.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;This means the 2 million records of each of the 5 lookups will be loaded 4 times.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Questions:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;1) Will this reloading of the lookups consume more memory?&lt;/P&gt; 
&lt;P&gt;2) When the second batch comes into play, will it also consider the lookup data already loaded for the first batch when looking up? If yes, this would become a one-to-many relationship, assuming unique join values across all the lookups.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Regards,&lt;BR /&gt;Romi&lt;/P&gt;</description>
      <pubDate>Mon, 17 Jul 2017 13:40:55 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328098#M97407</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-17T13:40:55Z</dc:date>
    </item>
    <item>
      <title>Re: Performance Issue in Talend Job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328099#M97408</link>
      <description>&lt;P&gt;1) Reloading the lookup should not consume more memory, since Java will garbage collect the memory from the previous load.&lt;/P&gt;
&lt;P&gt;2) It depends on your design.&lt;/P&gt;</description>
      <pubDate>Mon, 17 Jul 2017 13:57:16 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328099#M97408</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-17T13:57:16Z</dc:date>
    </item>
    <item>
      <title>Re: Performance Issue in Talend Job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328100#M97409</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regarding the point below:&lt;/P&gt;
&lt;P&gt;2) It depends on your design.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am using the Load Once option in tMap for the lookup model. How should I make sure that the data is reloaded freshly from the lookup when a new batch starts?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards,&lt;/P&gt;
&lt;P&gt;Romi&lt;/P&gt;</description>
      <pubDate>Mon, 17 Jul 2017 15:06:47 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328100#M97409</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-17T15:06:47Z</dc:date>
    </item>
    <item>
      <title>Re: Performance Issue in Talend Job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328101#M97410</link>
      <description>&lt;P&gt;If you are doing it as per the KB article, each time the loop iterates, the tMap will trigger the lookup again and load it again. &amp;nbsp;So set your tMap to Load Once; the batch iteration helps with that. &amp;nbsp;However, the other challenge is that you are looking up from a file. &amp;nbsp;It is harder to implement this logic with a file, since you generally need to read the whole file each time. &amp;nbsp;If you can stage the content of the file in a staging database, the job design becomes simpler, using SELECT statements with upper and lower bound limits.&lt;/P&gt;</description>
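The staging-table idea in the reply above can be sketched as follows. This is a hypothetical Python sketch using an in-memory SQLite database (the table and column names are made up): load the file's content into a staging table once, then let each batch pull only its own slice with a bounded SELECT instead of re-reading the whole file on every iteration.

```python
import sqlite3

# Stage the data once; in a real job this would come from the lookup file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO staging VALUES (?, ?)",
    [(i, f"row{i}") for i in range(1, 11)],
)

def fetch_batch(conn, lower, upper):
    """Return only the rows whose id falls inside the batch window."""
    cur = conn.execute(
        "SELECT id, name FROM staging WHERE id BETWEEN ? AND ? ORDER BY id",
        (lower, upper),
    )
    return cur.fetchall()
```

Each iteration then calls fetch_batch with its own lower and upper bounds, so memory use is driven by the batch size rather than by the full dataset.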
      <pubDate>Mon, 17 Jul 2017 18:55:16 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Perforamce-Issue-in-Talend-Job/m-p/2328101#M97410</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-07-17T18:55:16Z</dc:date>
    </item>
  </channel>
</rss>

