<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Memory Issue while performing joins on data in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Memory-Issue-while-performing-joins-on-data/m-p/2335897#M104403</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I am joining two files, each with 40 fields and a few million rows, but the job throws the error below:&lt;/P&gt; 
&lt;P&gt;--------------------------------------------------&lt;/P&gt; 
&lt;P&gt;&lt;STRONG&gt;java.lang.OutOfMemoryError: GC overhead limit exceeded&lt;/STRONG&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;--------------------------------------------------&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;I tried storing temp data in a temporary location through the tMap component and also enabled JVM arguments, but the job still cannot process the required data.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Please suggest.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Thanks!&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 21 May 2018 16:42:54 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2018-05-21T16:42:54Z</dc:date>
    <item>
      <title>Memory Issue while performing joins on data</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Memory-Issue-while-performing-joins-on-data/m-p/2335897#M104403</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I am joining two files, each with 40 fields and a few million rows, but the job throws the error below:&lt;/P&gt; 
&lt;P&gt;--------------------------------------------------&lt;/P&gt; 
&lt;P&gt;&lt;STRONG&gt;java.lang.OutOfMemoryError: GC overhead limit exceeded&lt;/STRONG&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;--------------------------------------------------&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;I tried storing temp data in a temporary location through the tMap component and also enabled JVM arguments, but the job still cannot process the required data.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Please suggest.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Thanks!&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 21 May 2018 16:42:54 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Memory-Issue-while-performing-joins-on-data/m-p/2335897#M104403</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2018-05-21T16:42:54Z</dc:date>
    </item>
    <item>
      <title>Re: Memory Issue while performing joins on data</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Memory-Issue-while-performing-joins-on-data/m-p/2335898#M104404</link>
      <description>&lt;P&gt;This happened to me too. Drop all columns that are not needed for the matching statement, then join the additional data back in after the matching is finished; things will speed up. Big string values (255 chars) in particular will drain performance!&lt;/P&gt;</description>
      <pubDate>Mon, 21 May 2018 17:54:24 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Memory-Issue-while-performing-joins-on-data/m-p/2335898#M104404</guid>
      <dc:creator>Jesperrekuh</dc:creator>
      <dc:date>2018-05-21T17:54:24Z</dc:date>
    </item>
    <item>
      <title>Re: Memory Issue while performing joins on data</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Memory-Issue-while-performing-joins-on-data/m-p/2335899#M104405</link>
      <description>&lt;P&gt;As already suggested, you could try to reduce the size of the joined data.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;But in any case, you must first calculate how much memory you actually need.&lt;/P&gt; 
&lt;P&gt;"A few million" is not a full description: a file with a few million rows and 40 columns could easily be 10, 20 or even 100 GB, and you need a lot of memory to join such files.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;An alternative solution: load both files into indexed database tables and do the JOIN in the database (even if you need the final result as a CSV file).&lt;/P&gt; 
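&lt;P&gt;A minimal sketch of the database approach, assuming Python with its built-in sqlite3 module stands in for whatever database is available. The helper names load_csv and join_to_csv, the table names and the file paths are hypothetical and not part of Talend. It loads each CSV into a table, indexes the join key, and streams the joined rows back out to a CSV file:&lt;/P&gt; 
&lt;PRE&gt;
```python
import csv
import sqlite3

# Hypothetical helpers for illustration; SQLite stands in for any database.
# Table and column names are pasted into the SQL text, so they must be trusted.


def load_csv(conn, table, path, key):
    """Load a CSV file (with a header row) into a new table and index the join key."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        columns = ", ".join('"{}"'.format(name) for name in header)
        marks = ", ".join("?" for _ in header)
        conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
        # executemany consumes the reader row by row instead of building a list
        conn.executemany('INSERT INTO "{}" VALUES ({})'.format(table, marks), reader)
    conn.execute('CREATE INDEX "idx_{0}_{1}" ON "{0}" ("{1}")'.format(table, key))
    conn.commit()


def join_to_csv(conn, left, right, key, out_path):
    """Inner-join two loaded tables on key and stream the result to a CSV file."""
    query = 'SELECT * FROM "{0}" AS l JOIN "{1}" AS r ON l."{2}" = r."{2}"'.format(
        left, right, key
    )
    cursor = conn.execute(query)
    rows = 0
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([col[0] for col in cursor.description])
        # Iterating the cursor fetches rows incrementally
        for row in cursor:
            writer.writerow(row)
            rows += 1
    return rows
```
&lt;/PRE&gt;
&lt;P&gt;Connecting with sqlite3.connect("join.db") rather than an in-memory database keeps the tables on disk, so the join no longer depends on how much heap the job has. In a Talend job the same idea would presumably be built from database output and input components with a JOIN query, depending on the database in use.&lt;/P&gt; 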
&lt;P&gt;Databases are much better suited to working with huge data sets under limited memory resources.&lt;/P&gt;</description>
      <pubDate>Tue, 22 May 2018 04:07:12 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Memory-Issue-while-performing-joins-on-data/m-p/2335899#M104405</guid>
      <dc:creator>vapukov</dc:creator>
      <dc:date>2018-05-22T04:07:12Z</dc:date>
    </item>
  </channel>
</rss>

