IO Errors when processing large tables

Anonymous — Sat, 16 Nov 2024 12:04:22 GMT

Hi Guys,
I'm currently processing a 25M records from a MS SQL Server database table. The source data consists of very large JSON strings stored within the database table, essentially we're talking about 30-40GB of data within the source table.
On small datasets my Job works well, however I've started experiencing problems when processing the production sized volumes (as above). I think the error is memory related but I cant prove this - I've monitored the job and it never seems to use more then 6GB of memory (16 available and set via JVM params).
I'm getting the following error on execution:
I/O Error: There is not enough space on the disk
Invalid state, the Connection object is closed.
Exception in component tMSSqlSP_10
java.sql.SQLException: Invalid state, the Connection object is closed.
at net.sourceforge.jtds.jdbc.TdsCore.checkOpen(TdsCore.java:452)
at net.sourceforge.jtds.jdbc.TdsCore.clearResponseQueue(TdsCore.java:727)
at net.sourceforge.jtds.jdbc.JtdsStatement.initialize(JtdsStatement.java:645)
at net.sourceforge.jtds.jdbc.JtdsPreparedStatement.execute(JtdsPreparedStatement.java:549)
at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.tMSSqlInput_7Process(LoadObj_order_line.java:6735)
at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.tMSSqlInput_4Process(LoadObj_order_line.java:1668)
at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.tMSSqlConnection_2Process(LoadObj_order_line.java:1004)
at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.tJava_1Process(LoadObj_order_line.java:861)
at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.runJobInTOS(LoadObj_order_line.java:10937)
at mfx_amq.loadobj_order_line_2_0.LoadObj_order_line.runJob(LoadObj_order_line.java:10691)
at mfx_amq.loadobjects_1_0.LoadObjects.tRunJob_1Process(LoadObjects.java:6719)
at mfx_amq.loadobjects_1_0.LoadObjects.tMSSqlInput_1Process(LoadObjects.java:3433)
at mfx_amq.loadobjects_1_0.LoadObjects.tMSSqlConnection_1Process(LoadObjects.java:2649)
at mfx_amq.loadobjects_1_0.LoadObjects.tJava_1Process(LoadObjects.java:2505)
at mfx_amq.loadobjects_1_0.LoadObjects$2.run(LoadObjects.java:8040)
I'm assuming the jobs running out of memory when executing SQL statement at the start of the job. How can i prevent this without batching up by source data?
Thanks,
Martin

Re: IO Errors when processing large tables

Anonymous — Sun, 24 Mar 2013 20:47:04 GMT

I am not sure about if Microsoft has implemented the fetch size feature in the result set. It is often a problem if the option is not set, the driver loads all data into the memory before delivering the first data set to the application (seen in PostgreSQL or MySQL).
Please check your query in a Java based database tool like SQuirrel and check if you get your data and play a bit with the options. If this work, we have to find a way to tweak the MsSQLInput components.

topic IO Errors when processing large tables in Talend Studio

IO Errors when processing large tables

Re: IO Errors when processing large tables