<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Talend is occupying the entire RAM in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Talend-is-occupying-the-entire-RAM/m-p/2255943#M38489</link>
    <description>Hi, 
&lt;BR /&gt;I have tested two cases for performance. 
&lt;BR /&gt; 
&lt;B&gt;Case 1:&lt;/B&gt; 
&lt;BR /&gt;PostgresDB ---&amp;gt; tMap (filtering; rows that do not satisfy the condition are rejected to a text file) ---&amp;gt; txt 
&lt;BR /&gt; 
&lt;B&gt;Case 2:&lt;/B&gt; 
&lt;BR /&gt;PostgresDB (filter handled in the SQL query) ---&amp;gt; txt 
&lt;BR /&gt; 
&lt;B&gt;Environment:&lt;/B&gt; 
&lt;BR /&gt;The source DB is on a different server. The data is around 2 GB. We are using a 16 GB machine running Red Hat Linux, of which 6 GB were free. 
&lt;BR /&gt; 
&lt;B&gt;Cases tested:&lt;/B&gt; 
&lt;BR /&gt; 
&lt;B&gt;Case 2:&lt;/B&gt; It took just 3-4 minutes to load the data. 
&lt;BR /&gt; 
&lt;B&gt;Case 1:&lt;/B&gt; It takes more than 30 minutes. 
&lt;BR /&gt;I have the following questions; kindly help me. 
&lt;BR /&gt;a) While filtering and rejecting records in Talend (Case 1), the entire RAM was occupied and swap memory was used, which makes the job far slower. 
&lt;BR /&gt; 
&lt;BR /&gt;If the data size is huge, it directly affects performance. For instance, if I need to process 512 GB of data, should my RAM be larger than that? How can people afford a 1 TB machine in this case? Is it the same with other ETL tools, or am I missing something? Kindly clarify. 
&lt;BR /&gt;b) The DB filter was much faster than Talend. Do you think the right approach is to push all functionality into the DB? 
&lt;BR /&gt;Thanks</description>
    <pubDate>Tue, 28 Feb 2017 10:13:55 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2017-02-28T10:13:55Z</dc:date>
    <item>
      <title>Talend is occupying the entire RAM</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Talend-is-occupying-the-entire-RAM/m-p/2255943#M38489</link>
      <description>Hi, 
&lt;BR /&gt;I have tested two cases for performance. 
&lt;BR /&gt; 
&lt;B&gt;Case 1:&lt;/B&gt; 
&lt;BR /&gt;PostgresDB ---&amp;gt; tMap (filtering; rows that do not satisfy the condition are rejected to a text file) ---&amp;gt; txt 
&lt;BR /&gt; 
&lt;B&gt;Case 2:&lt;/B&gt; 
&lt;BR /&gt;PostgresDB (filter handled in the SQL query) ---&amp;gt; txt 
&lt;BR /&gt; 
&lt;B&gt;Environment:&lt;/B&gt; 
&lt;BR /&gt;The source DB is on a different server. The data is around 2 GB. We are using a 16 GB machine running Red Hat Linux, of which 6 GB were free. 
&lt;BR /&gt; 
&lt;B&gt;Cases tested:&lt;/B&gt; 
&lt;BR /&gt; 
&lt;B&gt;Case 2:&lt;/B&gt; It took just 3-4 minutes to load the data. 
&lt;BR /&gt; 
&lt;B&gt;Case 1:&lt;/B&gt; It takes more than 30 minutes. 
&lt;BR /&gt;I have the following questions; kindly help me. 
&lt;BR /&gt;a) While filtering and rejecting records in Talend (Case 1), the entire RAM was occupied and swap memory was used, which makes the job far slower. 
&lt;BR /&gt; 
&lt;BR /&gt;If the data size is huge, it directly affects performance. For instance, if I need to process 512 GB of data, should my RAM be larger than that? How can people afford a 1 TB machine in this case? Is it the same with other ETL tools, or am I missing something? Kindly clarify. 
&lt;BR /&gt;b) The DB filter was much faster than Talend. Do you think the right approach is to push all functionality into the DB? 
&lt;BR /&gt;Thanks</description>
      <pubDate>Tue, 28 Feb 2017 10:13:55 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Talend-is-occupying-the-entire-RAM/m-p/2255943#M38489</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-02-28T10:13:55Z</dc:date>
    </item>
    <item>
      <title>Re: Talend is occupying the entire RAM</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Talend-is-occupying-the-entire-RAM/m-p/2255944#M38490</link>
      <description>tMap is fast because it does its calculations in memory, as any in-memory database does (leaving compression aside). If you want to work with 1 TB of data in memory, you need at least 2 TB of RAM. 
&lt;BR /&gt;The database is faster because it uses indexes for JOINs (provided you do not write a bad query). PostgreSQL, like many other databases, is designed to work with data many times larger than memory. 
&lt;BR /&gt;Note: all of the above applies if you do JOIN lookups or aggregations in tMap, that is, when you are not working on a single row of the flow at a time. 
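&lt;BR /&gt;A minimal sketch of that difference, written in Python rather than Talend's generated Java. The function names and sample rows are hypothetical and are not Talend APIs; they only illustrate per-row processing versus loading a lookup into memory. 
&lt;PRE&gt;
```python
import csv
import io


def stream_filter(rows, keep, accepted_out, rejected_out):
    """Route each row to one of two writers without storing any rows."""
    accepted = rejected = 0
    for row in rows:  # only the current row is held in memory
        if keep(row):
            accepted_out.writerow(row)
            accepted += 1
        else:
            rejected_out.writerow(row)
            rejected += 1
    return accepted, rejected


def lookup_join(rows, lookup_rows):
    """Join against a lookup table; the whole lookup is loaded first."""
    lookup = {key: value for key, value in lookup_rows}  # grows with lookup size
    for key, payload in rows:
        if key in lookup:
            yield key, payload, lookup[key]


# Hypothetical sample data standing in for the database flow.
source = ((i, "row%d" % i) for i in range(10))
ok_buf, bad_buf = io.StringIO(), io.StringIO()
counts = stream_filter(source, lambda r: r[0] % 2 == 0,
                       csv.writer(ok_buf), csv.writer(bad_buf))
joined = list(lookup_join([(1, "a"), (2, "b")], [(2, "x"), (3, "y")]))
print(counts, joined)
```
&lt;/PRE&gt;
&lt;BR /&gt;The first function holds one row at a time, so its memory use does not grow with the size of the flow; the second has to hold the whole lookup table before it can emit anything. 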
&lt;BR /&gt;If you only filter, we need to check what you are trying to achieve; it may be possible to do it another way.</description>
      <pubDate>Tue, 28 Feb 2017 11:41:36 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Talend-is-occupying-the-entire-RAM/m-p/2255944#M38490</guid>
      <dc:creator>vapukov</dc:creator>
      <dc:date>2017-02-28T11:41:36Z</dc:date>
    </item>
    <item>
      <title>Re: Talend is occupying the entire RAM</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Talend-is-occupying-the-entire-RAM/m-p/2255945#M38491</link>
      <description>Thanks Vapukov.
&lt;BR /&gt;How can data larger than RAM be handled in Talend ETL? Is there any other way, without pushing the work to the DB (ELT)?</description>
      <pubDate>Thu, 16 Mar 2017 14:20:51 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Talend-is-occupying-the-entire-RAM/m-p/2255945#M38491</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-03-16T14:20:51Z</dc:date>
    </item>
    <item>
      <title>Re: Talend is occupying the entire RAM</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Talend-is-occupying-the-entire-RAM/m-p/2255946#M38492</link>
      <description>&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;Thanks Vapukov.&lt;BR /&gt;How can data larger than RAM be handled in Talend ETL? Is there any other way, without pushing the work to the DB (ELT)?&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;Let's return to your original post. 
&lt;BR /&gt;You did not provide full information, so I can return the same question to you in "human-readable" form. 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MACn.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/154443iC5B8CACEF3D12C6A/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MACn.png" alt="0683p000009MACn.png" /&gt;&lt;/span&gt; 
&lt;BR /&gt;You need to move from one house to another and have a huge amount of old stuff, 
&lt;BR /&gt;and you want your new house to be clean. 
&lt;BR /&gt;You have two options: 
&lt;BR /&gt; 
&lt;BR /&gt;Unload all the stuff (for example, 1000 items) onto the street, sort it, and take 10 items with you. 
&lt;BR /&gt;Make a list of 10 items, take them, get in the car, and drive to the new home. 
&lt;BR /&gt; 
&lt;BR /&gt;Which way is faster? 
&lt;BR /&gt;And the same options, but when you want to take 50% of the items with you and must compare them? The situation could be different. 
&lt;BR /&gt;It is the same with your question: the speed of the whole Job will always depend on what you are really trying to do. How many records (in %) are rejected by the filter? Are there any aggregations? External lookups? 
&lt;BR /&gt;Short answer: Talend can work with big data sizes; which way is faster and better always depends on how properly you define the Job. 
&lt;BR /&gt;Talend, Postgres, the OS: they are all just items in your toolbox! 
&lt;BR /&gt;Why not take advantage of all of your tools, instead of doing every household task with only a hammer? 
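&lt;BR /&gt;The two options above can be sketched roughly as follows. This is only an illustration: an in-memory SQLite database stands in for the remote PostgreSQL server, and the table, columns, and threshold are made up for the example. 
&lt;PRE&gt;
```python
import sqlite3

# In-memory SQLite as a stand-in for the remote PostgreSQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)",
                 [(i, i * 10) for i in range(1, 1001)])

# Case 1: pull every row to the client, then filter there.
transferred_case1 = 0
kept_case1 = []
for row in conn.execute("SELECT id, amount FROM orders ORDER BY id"):
    transferred_case1 += 1
    if row[1] >= 9910:
        kept_case1.append(row)

# Case 2: the database filters; only matching rows are transferred.
kept_case2 = conn.execute(
    "SELECT id, amount FROM orders WHERE amount >= ? ORDER BY id", (9910,)
).fetchall()

print(transferred_case1, len(kept_case2))
```
&lt;/PRE&gt;
&lt;BR /&gt;Both cases keep the same 10 rows, but Case 1 moves all 1000 rows to the client first, while Case 2 moves only the 10 that match. 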
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MACn.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/154443iC5B8CACEF3D12C6A/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MACn.png" alt="0683p000009MACn.png" /&gt;&lt;/span&gt;</description>
      <pubDate>Thu, 16 Mar 2017 21:51:36 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Talend-is-occupying-the-entire-RAM/m-p/2255946#M38492</guid>
      <dc:creator>vapukov</dc:creator>
      <dc:date>2017-03-16T21:51:36Z</dc:date>
    </item>
    <item>
      <title>Re: Talend is occupying the entire RAM</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Talend-is-occupying-the-entire-RAM/m-p/2255947#M38493</link>
      <description>We handled the filter in the SQL query. Our servers are built with a small amount of RAM, so we don't want the job to consume more memory. 
&lt;BR /&gt;Thanks</description>
      <pubDate>Sun, 19 Mar 2017 16:14:47 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Talend-is-occupying-the-entire-RAM/m-p/2255947#M38493</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-03-19T16:14:47Z</dc:date>
    </item>
  </channel>
</rss>

