<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic IO Errors when processing large tables in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/IO-Errors-when-processing-large-tables/m-p/2340114#M108179</link>
    <description>Hi Guys, 
&lt;BR /&gt;I'm currently processing 25M records from an MS SQL Server database table. The source data consists of very large JSON strings stored within the table; in total we're talking about 30-40GB of data in the source table. 
&lt;BR /&gt;On small datasets my Job works well; however, I've started experiencing problems when processing production-sized volumes (as above). I think the error is memory related but I can't prove this: I've monitored the job and it never seems to use more than 6GB of memory (16GB available, set via JVM params). 
&lt;BR /&gt;I'm getting the following error on execution: 
&lt;BR /&gt;I/O Error: There is not enough space on the disk 
&lt;BR /&gt;Invalid state, the Connection object is closed. 
&lt;BR /&gt;Exception in component tMSSqlSP_10 
&lt;BR /&gt;java.sql.SQLException: Invalid state, the Connection object is closed. 
&lt;BR /&gt; at net.sourceforge.jtds.jdbc.TdsCore.checkOpen(TdsCore.java:452) 
&lt;BR /&gt; at net.sourceforge.jtds.jdbc.TdsCore.clearResponseQueue(TdsCore.java:727) 
&lt;BR /&gt; at net.sourceforge.jtds.jdbc.JtdsStatement.initialize(JtdsStatement.java:645) 
&lt;BR /&gt; at net.sourceforge.jtds.jdbc.JtdsPreparedStatement.execute(JtdsPreparedStatement.java:549) 
&lt;BR /&gt; at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.tMSSqlInput_7Process(LoadObj_order_line.java:6735) 
&lt;BR /&gt; at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.tMSSqlInput_4Process(LoadObj_order_line.java:1668) 
&lt;BR /&gt; at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.tMSSqlConnection_2Process(LoadObj_order_line.java:1004) 
&lt;BR /&gt; at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.tJava_1Process(LoadObj_order_line.java:861) 
&lt;BR /&gt; at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.runJobInTOS(LoadObj_order_line.java:10937) 
&lt;BR /&gt; at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.runJob(LoadObj_order_line.java:10691) 
&lt;BR /&gt; at mfx_amq.loadobjects_1_0.LoadObjects.tRunJob_1Process(LoadObjects.java:6719) 
&lt;BR /&gt; at mfx_amq.loadobjects_1_0.LoadObjects.tMSSqlInput_1Process(LoadObjects.java:3433) 
&lt;BR /&gt; at mfx_amq.loadobjects_1_0.LoadObjects.tMSSqlConnection_1Process(LoadObjects.java:2649) 
&lt;BR /&gt; at mfx_amq.loadobjects_1_0.LoadObjects.tJava_1Process(LoadObjects.java:2505) 
&lt;BR /&gt; at mfx_amq.loadobjects_1_0.LoadObjects$2.run(LoadObjects.java:8040) 
&lt;BR /&gt;I'm assuming the job is running out of memory when executing the SQL statement at the start of the job. How can I prevent this without batching up my source data? 
&lt;BR /&gt;Thanks, 
&lt;BR /&gt;Martin 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009ME4e.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/129974iAF7BABA590512D54/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009ME4e.png" alt="0683p000009ME4e.png" /&gt;&lt;/span&gt;</description>
    <pubDate>Sat, 16 Nov 2024 12:04:22 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2024-11-16T12:04:22Z</dc:date>
    <item>
      <title>IO Errors when processing large tables</title>
      <link>https://community.qlik.com/t5/Talend-Studio/IO-Errors-when-processing-large-tables/m-p/2340114#M108179</link>
      <description>Hi Guys, 
&lt;BR /&gt;I'm currently processing 25M records from an MS SQL Server database table. The source data consists of very large JSON strings stored within the table; in total we're talking about 30-40GB of data in the source table. 
&lt;BR /&gt;On small datasets my Job works well; however, I've started experiencing problems when processing production-sized volumes (as above). I think the error is memory related but I can't prove this: I've monitored the job and it never seems to use more than 6GB of memory (16GB available, set via JVM params). 
&lt;BR /&gt;I'm getting the following error on execution: 
&lt;BR /&gt;I/O Error: There is not enough space on the disk 
&lt;BR /&gt;Invalid state, the Connection object is closed. 
&lt;BR /&gt;Exception in component tMSSqlSP_10 
&lt;BR /&gt;java.sql.SQLException: Invalid state, the Connection object is closed. 
&lt;BR /&gt; at net.sourceforge.jtds.jdbc.TdsCore.checkOpen(TdsCore.java:452) 
&lt;BR /&gt; at net.sourceforge.jtds.jdbc.TdsCore.clearResponseQueue(TdsCore.java:727) 
&lt;BR /&gt; at net.sourceforge.jtds.jdbc.JtdsStatement.initialize(JtdsStatement.java:645) 
&lt;BR /&gt; at net.sourceforge.jtds.jdbc.JtdsPreparedStatement.execute(JtdsPreparedStatement.java:549) 
&lt;BR /&gt; at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.tMSSqlInput_7Process(LoadObj_order_line.java:6735) 
&lt;BR /&gt; at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.tMSSqlInput_4Process(LoadObj_order_line.java:1668) 
&lt;BR /&gt; at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.tMSSqlConnection_2Process(LoadObj_order_line.java:1004) 
&lt;BR /&gt; at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.tJava_1Process(LoadObj_order_line.java:861) 
&lt;BR /&gt; at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.runJobInTOS(LoadObj_order_line.java:10937) 
&lt;BR /&gt; at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.runJob(LoadObj_order_line.java:10691) 
&lt;BR /&gt; at mfx_amq.loadobjects_1_0.LoadObjects.tRunJob_1Process(LoadObjects.java:6719) 
&lt;BR /&gt; at mfx_amq.loadobjects_1_0.LoadObjects.tMSSqlInput_1Process(LoadObjects.java:3433) 
&lt;BR /&gt; at mfx_amq.loadobjects_1_0.LoadObjects.tMSSqlConnection_1Process(LoadObjects.java:2649) 
&lt;BR /&gt; at mfx_amq.loadobjects_1_0.LoadObjects.tJava_1Process(LoadObjects.java:2505) 
&lt;BR /&gt; at mfx_amq.loadobjects_1_0.LoadObjects$2.run(LoadObjects.java:8040) 
&lt;BR /&gt;I'm assuming the job is running out of memory when executing the SQL statement at the start of the job. How can I prevent this without batching up my source data? 
&lt;BR /&gt;Thanks, 
&lt;BR /&gt;Martin 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009ME4e.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/129974iAF7BABA590512D54/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009ME4e.png" alt="0683p000009ME4e.png" /&gt;&lt;/span&gt;</description>
      <pubDate>Sat, 16 Nov 2024 12:04:22 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/IO-Errors-when-processing-large-tables/m-p/2340114#M108179</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T12:04:22Z</dc:date>
    </item>
    <item>
      <title>Re: IO Errors when processing large tables</title>
      <link>https://community.qlik.com/t5/Talend-Studio/IO-Errors-when-processing-large-tables/m-p/2340115#M108180</link>
      <description>I am not sure whether Microsoft has implemented the fetch size feature for the result set. This is often the problem: if the option is not set, the driver loads all the data into memory before delivering the first record to the application (I have seen this with PostgreSQL and MySQL). 
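&lt;BR /&gt;A minimal sketch of what I mean, assuming the jTDS driver that shows up in your stack trace. The useCursors URL property is taken from the jTDS documentation as I remember it, so please verify it against your driver version. The host, database, table and column names are made up for illustration, and the driver is free to ignore the fetch size hint.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch only: server, database, table and column names are hypothetical.
final class JtdsStreaming {

    // Builds a jTDS URL; useCursors=true asks jTDS for server-side cursors
    // on forward-only, read-only result sets so that rows arrive in batches.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:jtds:sqlserver://" + host + ":" + port + "/" + database
                + ";useCursors=true";
    }

    // Reads the table row by row with a fetch-size hint and returns the row count.
    static long streamRows(Connection conn, int fetchSize) throws SQLException {
        long rows = 0;
        String sql = "SELECT id, json_payload FROM order_line"; // hypothetical query
        try (PreparedStatement ps = conn.prepareStatement(
                sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            ps.setFetchSize(fetchSize); // a hint; the driver may ignore it
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rs.getString("json_payload"); // process one row here
                    rows++;
                }
            }
        }
        return rows;
    }
}
```

If memory stays flat while streamRows runs against the large table, the same settings would be worth trying on the Talend connection component.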
&lt;BR /&gt;Please run your query in a Java-based database tool such as SQuirreL, check whether you get your data, and experiment a bit with the options. If this works, we have to find a way to tweak the tMSSqlInput components.</description>
      <pubDate>Sun, 24 Mar 2013 20:47:04 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/IO-Errors-when-processing-large-tables/m-p/2340115#M108180</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-03-24T20:47:04Z</dc:date>
    </item>
  </channel>
</rss>

