<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Performance issue with below design in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270601#M48462</link>
    <description>&lt;BLOCKQUOTE&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;I am struggling with your job design. tMap_3 and tMap_4 have no output at all and are therefore useless.&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;Thanks. I have 8 columns, say A, B, C, D, E, F, G, H. I filter the records on C, D, E, F, G (in tmap3 &amp;amp; tmap4) and pass only A, B, H to the reference buffer (tmap1 &amp;amp; tmap2). It can be done without tmap3 &amp;amp; tmap4, but at the cost of loading all the columns A through H into the reference buffer. Correct me if I'm wrong.</description>
    <pubDate>Tue, 23 Feb 2016 14:47:57 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2016-02-23T14:47:57Z</dc:date>
    <item>
      <title>Performance issue with below design</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270593#M48454</link>
      <description>Gurus, 
&lt;BR /&gt;I'm new to Talend and I'm stuck on a performance issue. Kindly help me fix it. 
&lt;BR /&gt;I have millions of records. None of the data can be extracted from a database; all of it comes from files. 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MDId.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/141390i43186438DF8A4DC3/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MDId.png" alt="0683p000009MDId.png" /&gt;&lt;/span&gt; 
&lt;BR /&gt;&amp;gt;&amp;gt;Removing duplicates and retaining the record with the max date (using the tSortRow and tUniqRow components) 
&lt;BR /&gt;&amp;gt;&amp;gt;Applying different filter conditions in tMap and tFilterRow. 
&lt;BR /&gt; 
&lt;BR /&gt;The job failed due to "main" java.lang.OutOfMemoryError: GC overhead limit exceeded 
&lt;BR /&gt;&amp;gt;&amp;gt;Increased the JVM argument&amp;nbsp;to -Xmx4096M 
&lt;BR /&gt;But I still got the same error. 
&lt;BR /&gt;&amp;gt;&amp;gt;Wrote the lookup data to temp files in tMap and sorted on disk in tSortRow, 
&lt;BR /&gt;and got the same error. 
&lt;BR /&gt; 
&lt;BR /&gt;My questions: 
&lt;BR /&gt;--&amp;gt;The sort is the main culprit. Is there any other way to sort the data (I don't have a staging DB to sort in)? 
&lt;BR /&gt;--&amp;gt;I'm reading the same reference file twice because I cannot connect a single tFileInputDelimited to two tMap lookups. Is there any way to read the file only once? 
&lt;BR /&gt;--&amp;gt;How can the overall design be improved? 
&lt;BR /&gt;Any guidance would be greatly appreciated. 
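&lt;BR /&gt;Since the goal is only to keep the record with the max date per key, one sort-free option is a single pass that remembers the best row seen so far for each key. The following is a minimal sketch in Python, not the job's actual logic. It assumes the distinct keys fit in memory; the column names id and updated, the semicolon delimiter and the sample rows are all hypothetical, and in a Talend job the same idea would need to be expressed in the job's own Java code.

```python
import csv
import io
from datetime import date


def latest_per_key(rows, key_field, date_field):
    """Keep one row per key: the row with the greatest ISO date."""
    best = {}
    for row in rows:
        key = row[key_field]
        current = best.get(key)
        # max() returns its first argument on ties, so the earlier row wins.
        best[key] = row if current is None else max(
            current, row, key=lambda r: date.fromisoformat(r[date_field])
        )
    return list(best.values())


# Hypothetical sample data; real files and column names will differ.
sample = io.StringIO(
    "id;updated;value\n"
    "1;2016-02-01;a\n"
    "2;2016-02-03;b\n"
    "1;2016-02-10;c\n"
)
result = latest_per_key(csv.DictReader(sample, delimiter=";"), "id", "updated")
```

Memory use here grows with the number of distinct keys rather than with the total row count, so this only helps when duplicates are common.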
&lt;BR /&gt;Thanks</description>
      <pubDate>Tue, 16 Feb 2016 12:40:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270593#M48454</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-02-16T12:40:10Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with below design</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270594#M48455</link>
      <description>I am struggling with your job design. tMap_3 and tMap_4 have no output at all and are therefore useless.</description>
      <pubDate>Tue, 16 Feb 2016 22:53:52 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270594#M48455</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-02-16T22:53:52Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with below design</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270595#M48456</link>
      <description>The reference files have 8 columns. In tmap3 &amp;amp; tmap4 I'm filtering rows and dropping a few columns (I cannot hold all the unwanted columns in the lookup buffer). This could also be done in tmap1 and tmap2, but for the sake of debugging (tracing the row count on each link) I used tmap3 &amp;amp; tmap4. Do you think that will affect performance?
&lt;BR /&gt;Thanks</description>
      <pubDate>Wed, 17 Feb 2016 06:45:37 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270595#M48456</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-02-17T06:45:37Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with below design</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270596#M48457</link>
      <description>Hi&amp;nbsp;
&lt;B&gt;&lt;FONT color="#5b5b5d"&gt;&lt;FONT size="2"&gt;&lt;FONT face="Verdana, Helvetica, Arial, sans-serif"&gt;rajmhn,&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;
&lt;BR /&gt;
&lt;B&gt;&lt;FONT color="#5b5b5d"&gt;&lt;FONT size="2"&gt;&lt;FONT face="Verdana, Helvetica, Arial, sans-serif"&gt;You don't need tmap3 and tmap4; you can filter those columns in tmap1 &amp;amp; 2, and enable the sort on disk option in the tSortRow advanced settings.&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;
&lt;BR /&gt;You can also try removing the filter step and filtering in tMap instead (I don't think it will improve performance, but it's worth a try).
&lt;BR /&gt;Thanks,
&lt;BR /&gt;Siva.</description>
      <pubDate>Wed, 17 Feb 2016 07:27:43 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270596#M48457</guid>
      <dc:creator>lvsiva</dc:creator>
      <dc:date>2016-02-17T07:27:43Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with below design</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270597#M48458</link>
      <description>Thanks siva.
&lt;BR /&gt;You don't require tmap3 and tmap4 as well you can filter those columns in tmap1 &amp;amp; 2
&lt;BR /&gt;&amp;gt;&amp;gt;The reference has 8 columns, and only a few are required. I cannot take all the columns into the tMap buffer. That's why I used an extra tMap; I'm also filtering records on a few conditions (though that could be implemented in tmap1 &amp;amp; tmap2). So I put both functions in tmap3 &amp;amp; tmap4.&amp;nbsp;
&lt;BR /&gt;Do you think it can be accomplished without tmap3 &amp;amp; tmap4?
&lt;BR /&gt;sort on disk option in tsort advanced settings
&lt;BR /&gt;&amp;gt;&amp;gt;I already enabled it.
&lt;BR /&gt;The job is very resource-intensive. I'm getting 3 million records from the source &amp;amp; 2.5 million from each reference. I allocated Xmx16384, and the job completed in 6 minutes.
&lt;BR /&gt;One general question: which sort algorithm does tSortRow use?
&lt;BR /&gt;Thanks</description>
      <pubDate>Wed, 17 Feb 2016 08:45:02 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270597#M48458</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-02-17T08:45:02Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with below design</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270598#M48459</link>
      <description>Someone please help me out.</description>
      <pubDate>Thu, 18 Feb 2016 17:14:53 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270598#M48459</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-02-18T17:14:53Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with below design</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270599#M48460</link>
      <description>Hi rajmhn,&lt;BR /&gt;&lt;BLOCKQUOTE&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;In reference, I have 8 columns. Only few required. I cannot take all the columns into tmap buffer. That's the reason, why I used tmap and also I'm filtering records based on few conditions(though can be implemented in tmap1 &amp;amp; tmap2).So I incorporated both functionalities in tmap3 &amp;amp; tmap4.&amp;nbsp; &lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;It can be achieved in tMap_1 and tMap_2 without using&amp;nbsp;&lt;FONT color="#5b5b5d"&gt;&lt;FONT size="2"&gt;tmap3 and tmap4.&lt;/FONT&gt;&lt;/FONT&gt;&lt;BR /&gt;What's&amp;nbsp;the current rows/s during the data processing(row rate)?&lt;BR /&gt;Best regards&lt;BR /&gt;Sabrina</description>
      <pubDate>Tue, 23 Feb 2016 09:55:13 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270599#M48460</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-02-23T09:55:13Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with below design</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270600#M48461</link>
      <description>Thanks Sabrina.
&lt;BR /&gt;It can be achieved in tMap_1 and tMap_2 without using tmap3 and tmap4.
&lt;BR /&gt;&amp;gt;&amp;gt;I have 8 columns, say A, B, C, D, E, F, G, H. I filter the records on C, D, E, F, G (in tmap3 &amp;amp; tmap4) and pass only A, B, H to the reference buffer (tmap1 &amp;amp; tmap2). It can be done without tmap3 &amp;amp; tmap4, but at the cost of loading all the columns A through H into the reference buffer. Correct me if I'm wrong.
&lt;BR /&gt;What's the current rows/s during the data processing(row rate)?
&lt;BR /&gt;&amp;gt;&amp;gt;It was around 5000 rows/sec
&lt;BR /&gt;Solutions to consider:
&lt;BR /&gt;&amp;gt;&amp;gt;Split the job into two: one up to tmap2, and another for sorting and removing duplicates.
&lt;BR /&gt;&amp;gt;&amp;gt;Write temp data to disk and assign less JVM Xmx memory
&lt;BR /&gt;&amp;gt;&amp;gt;Assign more JVM Xmx memory
&lt;BR /&gt;Which one would be the most feasible?
&lt;BR /&gt;Thanks</description>
      <pubDate>Tue, 23 Feb 2016 12:51:16 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270600#M48461</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-02-23T12:51:16Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with below design</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270601#M48462</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;I am struggling with your job design. tMap_3 and tMap_4 have no output at all and are therefore useless.&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;Thanks. I have 8 columns, say A, B, C, D, E, F, G, H. I filter the records on C, D, E, F, G (in tmap3 &amp;amp; tmap4) and pass only A, B, H to the reference buffer (tmap1 &amp;amp; tmap2). It can be done without tmap3 &amp;amp; tmap4, but at the cost of loading all the columns A through H into the reference buffer. Correct me if I'm wrong.</description>
      <pubDate>Tue, 23 Feb 2016 14:47:57 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270601#M48462</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-02-23T14:47:57Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with below design</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270602#M48463</link>
      <description>Hi, 
&lt;BR /&gt; 
&lt;FONT color="black"&gt;&lt;FONT size="3"&gt;&lt;FONT face="Calibri"&gt;tMap is a cache component that consumes a lot of memory. For a large data set, try storing the data on disk. Did you get any OutOfMemory error on your end?&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt; 
&lt;BR /&gt; 
&lt;FONT color="black"&gt;&lt;FONT size="3"&gt;&lt;FONT face="Calibri"&gt;Have you already checked the document about:&lt;A href="https://community.qlik.com/s/article/ka03p0000006EZuAAM" target="_blank"&gt;TalendHelpCenter:Exception outOfMemory&lt;/A&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt; 
&lt;BR /&gt; 
&lt;FONT color="black"&gt;&lt;FONT size="3"&gt;&lt;FONT face="Calibri"&gt;Would you mind uploading screenshots of your tmap3 and tmap4 map editors to the forum?&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt; 
&lt;BR /&gt;Best regards 
&lt;BR /&gt;Sabrina</description>
      <pubDate>Wed, 02 Mar 2016 03:48:26 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270602#M48463</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-03-02T03:48:26Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with below design</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270603#M48464</link>
      <description>Thanks.
&lt;BR /&gt;Yes, initially I got the OutOfMemory error. I tried two scenarios.
&lt;BR /&gt;Scenario 1:
&lt;BR /&gt;&amp;gt;&amp;gt;Increased Xmx to 16GB, and it worked. Performance was very good (6 min). Is it a good idea to use this much memory?
&lt;BR /&gt;Scenario 2:
&lt;BR /&gt;&amp;gt;&amp;gt;Reduced Xmx to 8GB and enabled the store on disk option in tmap_1 &amp;amp; tmap_2, but performance was not good. With this option, tMap sorts the data and stores it on disk before the join.
&lt;BR /&gt;I didn't apply store on disk to tmap_3 &amp;amp; tmap_4. Do you think that would be a good idea?
&lt;BR /&gt;I cannot upload the screen shot. Getting this issue&amp;nbsp;
&lt;BR /&gt;
&lt;FONT face="Verdana, Helvetica, Arial, sans-serif"&gt;Error : The server was unable to save the uploaded file. Please contact the forum administrator at&lt;/FONT&gt;
&lt;BR /&gt;
&lt;BR /&gt;In tmap_3 &amp;amp; tmap_4, I removed the unwanted columns (from 9 down to 3 columns) and filtered the records on a few conditions.
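&lt;BR /&gt;The filter-then-project step described above can be sketched as a streaming pass, so that only the slimmed rows ever reach the lookup buffer. This is a rough Python illustration under stated assumptions, not the actual tMap configuration: the kept columns A, B, H come from the earlier posts, while the filter condition on C, the semicolon delimiter and the sample rows are hypothetical because the real conditions are not given in the thread.

```python
import csv
import io

KEEP = ("A", "B", "H")  # columns the join needs, per the earlier posts


def slim_lookup(rows, keep, predicate):
    """Yield only the kept columns of rows that pass the filter."""
    for row in rows:
        if predicate(row):
            yield {name: row[name] for name in keep}


# Hypothetical data and filter condition, for illustration only.
sample = io.StringIO(
    "A;B;C;D;E;F;G;H\n"
    "1;x;ok;0;0;0;0;h1\n"
    "2;y;bad;0;0;0;0;h2\n"
)
reader = csv.DictReader(sample, delimiter=";")
lookup = list(slim_lookup(reader, KEEP, lambda r: r["C"] == "ok"))
```

Because the generator handles one row at a time, the full 8-column rows are never all held in memory together; only the filtered three-column rows accumulate.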
&lt;BR /&gt;Thanks</description>
      <pubDate>Thu, 03 Mar 2016 17:11:03 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270603#M48464</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-03-03T17:11:03Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with below design</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270604#M48465</link>
      <description>You could split this into multiple jobs to do the filtering and deduplicating. Then pass the cleaned data into the job above, without tMap_3 &amp;amp; 4, and remove the sort, tUniqRow, and tFilterRow.</description>
      <pubDate>Fri, 04 Mar 2016 08:14:21 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-issue-with-below-design/m-p/2270604#M48465</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-03-04T08:14:21Z</dc:date>
    </item>
  </channel>
</rss>

