Anonymous
Not applicable

Problem reading large files

Hi
I have a job that reads and processes a positional file. The file is read by a tFileInputPositional component connected to a tConvertType component, and six child jobs are triggered from the parent job on subJobOk; tFileInputPositional is the first component of the parent job. The input file is 530MB and the records are 354 bytes long.
The job runs correctly on small files but throws a "java.lang.OutOfMemoryError" in the tFileInputPositional component after 100233 rows of the 530MB file. Xmx is set to 1024M in Window > Preferences > Talend > Run/Debug.
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.lang.String.<init>(String.java:208)
at java.io.BufferedReader.readLine(BufferedReader.java:331)
at java.io.BufferedReader.readLine(BufferedReader.java:362)
at org.talend.fileprocess.delimited.RowParser.readRecord(RowParser.java:156)
at x.y.z.tFileInputPositional_1Process(z.java:7532)

I tried setting the Xmx parameter to 1800M and 2048M, but that results in an "unable to create JVM" error:
Could not create the Java virtual machine.
Error occurred during initialization of VM
Could not reserve enough space for object heap

Has anyone come across and resolved this issue? Is there a different way I should be reading large files?
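
For reference, a minimal sketch (plain Java, not the code Talend generates) of streaming a fixed-width file record by record, so that only the current line is held in memory. The file name, field offsets, and handleRecord sink below are all hypothetical:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class PositionalStreamReader {
    public static void main(String[] args) throws IOException {
        // Stream the file line by line; only the current record stays on the heap.
        try (BufferedReader reader = new BufferedReader(new FileReader("input.dat"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Hypothetical fixed-width layout; replace offsets with the real schema.
                String id     = line.substring(0, 10).trim();
                String name   = line.substring(10, 50).trim();
                String amount = line.substring(50, 60).trim();
                handleRecord(id, name, amount); // process, then let the record be garbage collected
            }
        }
    }

    // Placeholder sink: write to a database, another file, etc.
    private static void handleRecord(String id, String name, String amount) {
    }
}

If each record is handled and discarded like this, heap usage stays flat regardless of file size, so an OutOfMemoryError after roughly 100,000 rows of 354 bytes (about 35MB of raw data) usually suggests rows are being buffered somewhere downstream rather than in the reader itself.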
11 Replies
amaumont
Contributor III

Indeed Volker, you are right; this is the reason we improved tAggregateRow in the 3.1.0 M1/M2 releases. The new tAggregateRow no longer uses the very ugly nested Maps.
I will therefore update the "tAggregateRowOpt" on the "Exchange" page (previously called "Ecosystem") so that it is an exact copy of the tAggregateRow from 3.1. "tAggregateRowOpt" remains useful for users running TOS versions earlier than 3.1.
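
For illustration only, and assuming nothing about the actual tAggregateRow internals: the difference between nested Maps and a single map keyed by a composite group key might look like this for a two-column grouping.

import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class AggregateSketch {
    // Nested style: one inner Map allocated per distinct value of the first column.
    static final Map<String, Map<String, Long>> nested = new HashMap<>();

    // Flat style: a single Map keyed by one small composite key object per group.
    static final class GroupKey {
        final String col1, col2;
        GroupKey(String col1, String col2) { this.col1 = col1; this.col2 = col2; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof GroupKey)) return false;
            GroupKey k = (GroupKey) o;
            return col1.equals(k.col1) && col2.equals(k.col2);
        }
        @Override public int hashCode() { return Objects.hash(col1, col2); }
    }
    static final Map<GroupKey, Long> flat = new HashMap<>();

    static void add(String col1, String col2, long amount) {
        // Nested: two lookups, plus an extra HashMap per distinct col1 value.
        nested.computeIfAbsent(col1, k -> new HashMap<>()).merge(col2, amount, Long::sum);
        // Flat: one lookup and one key object per group.
        flat.merge(new GroupKey(col1, col2), amount, Long::sum);
    }
}

The flat layout avoids allocating an inner Map for every distinct value of the leading grouping column, which is typically where the memory overhead of the nested approach comes from.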

Anonymous
Not applicable
Author

That's interesting. I'll take a look at the new version. I also did some brainstorming myself on ways to avoid the memory consumption (and improve speed too).