<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: tJDBCInput issue with large dataset in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/tJDBCInput-issue-with-large-dataset/m-p/2346318#M113733</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt; 
&lt;P&gt;Have you tried allocating more memory to your Studio to see if that works? Please see this article: &lt;A title="TalendHelpCenter:Allocating more memory to Talend Studio" href="https://help.talend.com/reader/XWYVXqDVIHwy7uFCjwQHYw/FQoY~9hnjr0Ta_l7gxhEcg" target="_self" rel="nofollow noopener noreferrer"&gt;TalendHelpCenter:Allocating more memory to Talend Studio&lt;/A&gt;.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Best regards&lt;/P&gt; 
&lt;P&gt;Sabrina&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 29 May 2018 07:42:05 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2018-05-29T07:42:05Z</dc:date>
    <item>
      <title>tJDBCInput issue with large dataset</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tJDBCInput-issue-with-large-dataset/m-p/2346317#M113732</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt; 
&lt;P&gt;I am quite new to Talend and have just started using it. I have to copy data from several sources, one of which is a Postgres DB with SSL enabled. Initially the ready-made DB connection for Postgres didn't work with SSL, and I came across a suggestion to use the generic JDBC connection with a few changes to the JDBC URL, which worked. I am able to connect to the Postgres DB and retrieve the schema. There is one table with 5 million records and 35 columns. The requirement is to drop the table on each refresh and recreate it.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I am getting the error below:&lt;/P&gt; 
&lt;P&gt;&lt;FONT size="2"&gt;&lt;EM&gt;[statistics] connecting to socket on port 3819&lt;/EM&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;&lt;EM&gt;[statistics] connected&lt;/EM&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;&lt;EM&gt;Exception in thread "main" java.lang.OutOfMemoryError: Java heap space&lt;/EM&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;&lt;EM&gt;at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1969)&lt;/EM&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;&lt;EM&gt;at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)&lt;/EM&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;&lt;EM&gt;at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:570)&lt;/EM&gt;&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I am using the following setup:&lt;/P&gt; 
&lt;OL&gt; 
 &lt;LI&gt;Created a DB connection using the generic JDBC connection to connect to the Postgres DB&lt;/LI&gt; 
 &lt;LI&gt;Retrieved the schema&lt;/LI&gt; 
 &lt;LI&gt;Created a job: tJDBCInput -&amp;gt; tOracleOutput&lt;/LI&gt; 
&lt;/OL&gt; 
&lt;P&gt;For &lt;SPAN&gt;tJDBCInput, the settings are shown below.&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;Somewhere it was mentioned that MySQL and Postgres copy the entire input data onto local disk before copying it to the destination, and it was advised that "Enable stream" is the MySQL equivalent of the Postgres "Enable Cursor" option. Hence I used Enable Cursor.&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture1.JPG" style="width: 737px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LxMG.jpg"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/139092i87D9BD41352B0626/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LxMG.jpg" alt="0683p000009LxMG.jpg" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Run settings for the job:&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture2.JPG" style="width: 779px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LxMQ.jpg"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/146885i93C85FB2BE897D29/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LxMQ.jpg" alt="0683p000009LxMQ.jpg" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;This runs for a while without copying anything to the destination and then fails with the Java heap space error. I am not sure what else I can do to copy the data and make this work. Please note that with all these settings, if I limit the rows to 3 million, everything works.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;A few points to help you help me &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt; 
&lt;OL&gt; 
 &lt;LI&gt;Xms/Xmx of 4048 is the maximum I can allocate; I don't have any more memory.&lt;/LI&gt; 
 &lt;LI&gt;Execution time is not a big concern as long as the job is robust and executes periodically.&lt;/LI&gt; 
 &lt;LI&gt;The entire table with all its columns has to be copied; no columns or rows may be left out.&lt;/LI&gt; 
 &lt;LI&gt;It has to be done as one job and not in batches; the data in the table is such that it is almost impossible to uniquely identify each row using a single column.&lt;/LI&gt; 
&lt;/OL&gt; 
&lt;P&gt;Any help or suggestions would be greatly appreciated. Thanks in advance.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 25 May 2018 09:36:42 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tJDBCInput-issue-with-large-dataset/m-p/2346317#M113732</guid>
      <dc:creator>Gourav_King_of_DataLand</dc:creator>
      <dc:date>2018-05-25T09:36:42Z</dc:date>
    </item>
    <item>
      <title>Re: tJDBCInput issue with large dataset</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tJDBCInput-issue-with-large-dataset/m-p/2346318#M113733</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt; 
&lt;P&gt;Have you tried allocating more memory to your Studio to see if that works? Please see this article: &lt;A title="TalendHelpCenter:Allocating more memory to Talend Studio" href="https://help.talend.com/reader/XWYVXqDVIHwy7uFCjwQHYw/FQoY~9hnjr0Ta_l7gxhEcg" target="_self" rel="nofollow noopener noreferrer"&gt;TalendHelpCenter:Allocating more memory to Talend Studio&lt;/A&gt;.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Best regards&lt;/P&gt; 
&lt;P&gt;Sabrina&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 29 May 2018 07:42:05 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tJDBCInput-issue-with-large-dataset/m-p/2346318#M113733</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2018-05-29T07:42:05Z</dc:date>
    </item>
    <item>
      <title>Re: tJDBCInput issue with large dataset</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tJDBCInput-issue-with-large-dataset/m-p/2346319#M113734</link>
      <description>&lt;P&gt;Hi Sabrina,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for the help. I tried the suggestion mentioned in the post and set the .ini files as recommended:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;-vmargs&lt;BR /&gt;-Xms1024m&lt;BR /&gt;-Xmx4096m&lt;BR /&gt;-Dfile.encoding=UTF-8&lt;BR /&gt;-Dosgi.requiredJavaVersion=1.8&lt;BR /&gt;-XX:+UseG1GC&lt;BR /&gt;-XX:+UseStringDeduplication&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But I am still getting the error message:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;Exception in thread "main" java.lang.OutOfMemoryError: Java heap space&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;All the settings are the same as mentioned in the original post. Somewhere I read that the data is first stored in local memory and then copied to the destination. Is there a way to avoid storing the data and just stream it from source to destination?&lt;/P&gt;
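&lt;P&gt;For reference, here is a minimal sketch of what streaming would look like at the plain JDBC level. It relies on the documented behaviour of the PostgreSQL JDBC driver, which reads through a server-side cursor only when autocommit is off, the statement is forward-only and a positive fetch size is set. The class name CursorFetchSketch, the RowHandler callback and the fetch size value are hypothetical, and it is an assumption (not something confirmed here) that the cursor option of tJDBCInput generates equivalent code.&lt;/P&gt;
&lt;PRE&gt;
```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CursorFetchSketch {

    /** Hypothetical callback that receives one row at a time. */
    public interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    /**
     * Reads the query result in batches of fetchSize rows instead of
     * loading the whole result set into the heap.
     * Returns the number of rows passed to the handler.
     */
    public static long streamRows(Connection conn, String sql, int fetchSize,
                                  RowHandler handler) throws SQLException {
        // With autocommit on, the PostgreSQL driver ignores the fetch size
        // and buffers every row in memory.
        conn.setAutoCommit(false);
        long count = 0;
        try (Statement st = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // Ask the driver to pull rows from the server in batches.
            st.setFetchSize(fetchSize);
            try (ResultSet rs = st.executeQuery(sql)) {
                while (rs.next()) {
                    handler.handle(rs); // e.g. write the row to the target
                    count++;
                }
            }
        }
        conn.commit(); // end the read-only transaction
        return count;
    }
}
```
&lt;/PRE&gt;
&lt;P&gt;With something like this, only about one batch of rows (for example 1000) should be held in the heap at any time, so memory use would not grow with the size of the table.&lt;/P&gt;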
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 29 May 2018 14:32:54 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tJDBCInput-issue-with-large-dataset/m-p/2346319#M113734</guid>
      <dc:creator>Gourav_King_of_DataLand</dc:creator>
      <dc:date>2018-05-29T14:32:54Z</dc:date>
    </item>
  </channel>
</rss>

