<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to increase rows/second read in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/How-to-increase-rows-second-read/m-p/2304359#M76144</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;How much RAM do you have? Which Talend product are you using?&lt;/P&gt;&lt;P&gt;Generally speaking, the following aspects can affect job performance:&lt;/P&gt;&lt;P&gt; 1. The volume of data: performance degrades when the job reads a large data set.&lt;/P&gt;&lt;P&gt; 2. The structure of the data: if there are many columns on tDBRow, transferring the data during job execution consumes a lot of memory and time.&lt;/P&gt;&lt;P&gt; 3. The database connection: a job usually runs better when the database is installed locally. If the database is on another machine, you may run into congestion and latency issues, even if you are on a VPN.&lt;/P&gt;&lt;P&gt;Best regards&lt;/P&gt;&lt;P&gt;Sabrina&lt;/P&gt;</description>
    <pubDate>Thu, 05 Nov 2020 08:41:52 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2020-11-05T08:41:52Z</dc:date>
    <item>
      <title>How to increase rows/second read</title>
      <link>https://community.qlik.com/t5/Talend-Studio/How-to-increase-rows-second-read/m-p/2304358#M76143</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I'm trying to create a new table with 2 columns.&lt;/P&gt;&lt;P&gt;That table is the result of joining 11 tables (15 columns and 20,000,000 rows per table).&lt;/P&gt;&lt;P&gt;The read speed started out excellent (around 40,000 rows/sec), but then it became much slower. For the last 2 hours it has been running at 4,000 rows/sec.&lt;/P&gt;&lt;P&gt;How can I increase that speed? The screenshot below shows the entire process and its current speed.&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0693p00000AGPe1AAH.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/134617i78060EB0031B553C/image-size/large?v=v2&amp;amp;px=999" role="button" title="0693p00000AGPe1AAH.png" alt="0693p00000AGPe1AAH.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 01:10:41 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/How-to-increase-rows-second-read/m-p/2304358#M76143</guid>
      <dc:creator>FGuijarro</dc:creator>
      <dc:date>2024-11-16T01:10:41Z</dc:date>
    </item>
    <item>
      <title>Re: How to increase rows/second read</title>
      <link>https://community.qlik.com/t5/Talend-Studio/How-to-increase-rows-second-read/m-p/2304359#M76144</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;How much RAM do you have? Which Talend product are you using?&lt;/P&gt;&lt;P&gt;Generally speaking, the following aspects can affect job performance:&lt;/P&gt;&lt;P&gt; 1. The volume of data: performance degrades when the job reads a large data set.&lt;/P&gt;&lt;P&gt; 2. The structure of the data: if there are many columns on tDBRow, transferring the data during job execution consumes a lot of memory and time.&lt;/P&gt;&lt;P&gt; 3. The database connection: a job usually runs better when the database is installed locally. If the database is on another machine, you may run into congestion and latency issues, even if you are on a VPN.&lt;/P&gt;&lt;P&gt;Best regards&lt;/P&gt;&lt;P&gt;Sabrina&lt;/P&gt;</description>
      <pubDate>Thu, 05 Nov 2020 08:41:52 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/How-to-increase-rows-second-read/m-p/2304359#M76144</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2020-11-05T08:41:52Z</dc:date>
    </item>
    <item>
      <title>Re: How to increase rows/second read</title>
      <link>https://community.qlik.com/t5/Talend-Studio/How-to-increase-rows-second-read/m-p/2304360#M76145</link>
      <description>&lt;P&gt;Hi xdshi,&lt;/P&gt;&lt;P&gt;Thanks for your answer.&lt;/P&gt;&lt;P&gt;I'm using Talend 7.3.1.20200219_1130.&lt;/P&gt;&lt;P&gt;The server has 8GB of RAM and a 6-core processor.&lt;/P&gt;&lt;P&gt;The memory configuration in Talend is: Xms2049M, Xmx8192M.&lt;/P&gt;&lt;P&gt;Each of the 11 tables has around 80,000,000 rows and is between 20GB and 40GB in size.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The problem is that running the job gives me 2 different errors:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;"java.sql.SQLNonTransientConnectionException: (conn=-248284684) unexpected end of stream, read 50 bytes from 84 (socket was closed by server)"&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;and&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;java.lang.OutOfMemoryError: Java heap space&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Thanks for your help!!&lt;/P&gt;</description>
      <pubDate>Mon, 09 Nov 2020 10:00:20 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/How-to-increase-rows-second-read/m-p/2304360#M76145</guid>
      <dc:creator>FGuijarro</dc:creator>
      <dc:date>2020-11-09T10:00:20Z</dc:date>
    </item>
    <item>
      <title>Re: How to increase rows/second read</title>
      <link>https://community.qlik.com/t5/Talend-Studio/How-to-increase-rows-second-read/m-p/2304361#M76146</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does this performance issue occur for one job or for all jobs?&lt;/P&gt;&lt;P&gt;If all jobs:&lt;/P&gt;&lt;P&gt;Could you add your workspace to the antivirus exclusion list?&lt;/P&gt;&lt;P&gt;Could you disable the operating system's drive indexing?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If only this job:&lt;/P&gt;&lt;P&gt;What is the allocated heap size? (Xmx value in the ini file)&lt;/P&gt;&lt;P&gt;Where is the database located: on a local drive or a remote one?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can you check whether performance improves if you enable parallelization on the job (right click / enable parallelization)?&lt;/P&gt;</description>
      <pubDate>Mon, 09 Nov 2020 10:28:08 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/How-to-increase-rows-second-read/m-p/2304361#M76146</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2020-11-09T10:28:08Z</dc:date>
    </item>
    <item>
      <title>Re: How to increase rows/second read</title>
      <link>https://community.qlik.com/t5/Talend-Studio/How-to-increase-rows-second-read/m-p/2304362#M76147</link>
      <description>&lt;P&gt;Hi tsesdl,&lt;/P&gt;&lt;P&gt;Thanks for your answer! I've tried with only one table and the problem also occurs.&lt;/P&gt;&lt;P&gt;The system Talend is running on is:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;B&gt;Intel Xeon Gold CPU@2.30GHz (2 processors)&lt;/B&gt;&lt;/LI&gt;&lt;LI&gt;&lt;B&gt;Windows Server 2019 64-bit&lt;/B&gt;&lt;/LI&gt;&lt;LI&gt;&lt;B&gt;RAM: 8GB&lt;/B&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Checking the memory assigned to the JVM from the command prompt:&lt;/P&gt;&lt;P&gt;    java -XshowSettings:vm&lt;/P&gt;&lt;P&gt;&lt;B&gt;        VM settings:&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;&amp;nbsp;           &amp;nbsp;Max. Heap Size (Estimated): 1.78G&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;&amp;nbsp;&amp;nbsp;           Ergonomics Machine Class: client&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;&amp;nbsp;&amp;nbsp;           Using VM: Java HotSpot(TM) 64-Bit Server VM&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In Talend, before running, I always change the memory settings in the "Run" tab to:&lt;/P&gt;&lt;P&gt;&lt;B&gt;   -Xms256M&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;   -Xmx8016M&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, the file &lt;B&gt;TOS_DI-win-x86_64.ini&lt;/B&gt; contains:&lt;/P&gt;&lt;P&gt;&lt;B&gt;-vmargs&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;-Xms512m&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;-Xmx1536m&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;-Dfile.encoding=UTF-8&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;-Dosgi.requiredJavaVersion=1.8&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;-XX:+UseG1GC&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;-XX:+UseStringDeduplication&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;-XX:MaxMetaspaceSize=512m&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Running the job with parallelization, I get this error message:&lt;/P&gt;&lt;P&gt;&lt;B&gt;"Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded"&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for your help!&lt;/P&gt;</description>
      <pubDate>Mon, 09 Nov 2020 11:18:45 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/How-to-increase-rows-second-read/m-p/2304362#M76147</guid>
      <dc:creator>FGuijarro</dc:creator>
      <dc:date>2020-11-09T11:18:45Z</dc:date>
    </item>
  </channel>
</rss>

