Anonymous
Not applicable

Java heap space problem in Talend Open Studio 4.0.1

Hi there,
I'm a newbie to this forum and Talend in general. I run the above version of Talend in Windows 2007. I use a number of jobs which were written by some contractors we hired about 9 months ago. One of these jobs, Strength01, has been failing with the following messages:-
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source)
at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
at java.lang.AbstractStringBuilder.append(Unknown Source)
at java.lang.StringBuilder.append(Unknown Source)
at com.csvreader.CsvReader.readRecord(CsvReader.java:1036)
at core_extract.strengthsub02_1_0.StrengthSub02.tFileList_1Process(StrengthSub02.java:12235)
at core_extract.strengthsub02_1_0.StrengthSub02.tHashInput_tUnite_1Process(StrengthSub02.java:16739)
at core_extract.strengthsub02_1_0.StrengthSub02.runJobInTOS(StrengthSub02.java:17696)
at core_extract.strengthsub02_1_0.StrengthSub02.runJob(StrengthSub02.java:17577)
at core_extract.strength01_1_0.Strength01.tRunJob_1Process(Strength01.java:413)
at core_extract.strength01_1_0.Strength01.runJobInTOS(Strength01.java:652)
at core_extract.strength01_1_0.Strength01.main(Strength01.java:526)

From the information I've gleaned so far from looking this up on the Internet, it is suggested that I edit the file:-
TalendOpenStudio-win32-x86.ini
which contains the line
-vmargs -Xms64m -Xmx768m -XX:MaxPermSize=600m

and change the number following "MaxPermSize". I've tried a number of values, with 600 being the latest one, but the problem still occurs.
On the network where I run Talend we have two PCs that have Talend installed. I got the same error on the second PC. Is the problem related to the ini file on the PC(s), or could it be related to our server?
Am I missing anything?
Is there anything else I should do?
Thank you in anticipation,
Richard
18 Replies
Anonymous
Not applicable
Author

Could you tell us how big the files you are loading are?
It seems the job is failing while loading some CSVs.
*Note: I suppose you can always call those contractors and ask them to fix it.
Anonymous
Not applicable
Author

The argument you want to adjust for the heap is -Xmx; try increasing this to 1024m. A MaxPermSize of 128m should be fine.
Use trial and error to see if you can find a value that will get your job to run.
It looks like there's a custom CSV reader class that may be reading the entire input file into memory. That's fine for a moderate-sized document (100-200 KB), but not if the file is large. Can the custom CSV class be replaced with a tFileInputDelimited? That way, the input is processed line by line and the overall memory doesn't need to exceed that required for a single row.
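To illustrate the line-by-line idea in plain Java, here is a minimal sketch. It is not Talend-generated code; the class name `StreamRows`, the method `countRows`, and the UTF-8 charset are assumptions made for the example. Only the current line is held in memory, so heap use depends on the longest row rather than on the size of the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StreamRows {

    // Reads the file one line at a time and returns the number of lines.
    // Assumption: the file is UTF-8 text; adjust the charset for your data.
    static long countRows(Path file) throws IOException {
        long rows = 0;
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                rows++; // a real job would parse and emit the row here
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java StreamRows <file>");
            return;
        }
        System.out.println(countRows(Paths.get(args[0])));
    }
}
```

Note that this only keeps memory low if each row is reasonably short; a single enormous record would still have to fit in the heap.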
Anonymous
Not applicable
Author

The files I'm loading are big - up to 45 megabytes.
Anonymous
Not applicable
Author

The argument you want to adjust for the heap is -Xmx; try increasing this to 1024m. A MaxPermSize of 128m should be fine.
Use trial and error to see if you can find a value that will get your job to run.

I'll give this a try and let you know how I get on
It looks like there's a custom CSV reader class that may be reading the entire input file into memory. That's fine for a moderate-sized document (100-200 KB), but not if the file is large. Can the custom CSV class be replaced with a tFileInputDelimited? That way, the input is processed line by line and the overall memory doesn't need to exceed that required for a single row.

I'll check this out too.
Thanks very much walkerca, I'll let you know how I get on.
alevy
Specialist

The -Xmx argument in the .ini file controls the memory available to the Studio itself, so it only matters up to the point of building a job. It makes no difference to the actual running of the job. The memory allocated to a running job is set by default through Window > Preferences > Talend > Run/Debug, or for a specific job under JVM arguments on the left side of the Run tab.
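One way to check which limit the job's JVM actually received is to print it from inside the running process. The sketch below is an illustration only: the class name `HeapCheck` and the helper `maxHeapMegabytes` are hypothetical, while `Runtime.maxMemory()` and the `os.arch` system property are standard Java. The same two calls could be pasted into a job, for example in a tJava component, assuming your job has room for one.

```java
public class HeapCheck {

    // Upper bound the JVM will try to use for the heap, in megabytes.
    // This reflects the effective -Xmx of the process that runs this code.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMegabytes());
        // The architecture reported by the JVM, e.g. "x86" for a 32-bit JVM.
        System.out.println("os.arch: " + System.getProperty("os.arch"));
    }
}
```

If the number printed by the job does not change after editing the .ini file, that confirms the .ini setting is not the one the job uses.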
Anonymous
Not applicable
Author

The job fails because a StringBuilder object is trying to expand its backing array to hold a very big record.
tFileInputDelimited with the CSV options enabled uses the third-party "com.csvreader.CsvReader" under the hood, so it is possible that you are using it already, because it is present in the stack trace.
You should post the job and example data if you want more optimization insights.
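One plausible cause of a single very big record (an assumption, not something confirmed in this thread) is a text-enclosure quote that is opened but never closed, so the reader keeps appending later lines to the same field. The sketch below is a rough diagnostic, not part of Talend or CsvReader: the class `QuoteCheck` and the method `linesWithOddQuotes` are hypothetical names, and it assumes the enclosure character is the double quote. It lists the physical lines that contain an odd number of quotes; legitimate multi-line fields will also be reported, so treat the output as a list of places to look.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class QuoteCheck {

    // Returns the 1-based numbers of physical lines with an odd count of '"'.
    // ISO-8859-1 is used because it decodes any byte without error, and '"'
    // is the same byte in ASCII-compatible encodings such as UTF-8.
    static List<Integer> linesWithOddQuotes(Path file) throws IOException {
        List<Integer> suspects = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.ISO_8859_1)) {
            String line;
            int lineNo = 0;
            while ((line = in.readLine()) != null) {
                lineNo++;
                int quotes = 0;
                for (int i = 0; i < line.length(); i++) {
                    if (line.charAt(i) == '"') {
                        quotes++;
                    }
                }
                if (quotes % 2 != 0) {
                    suspects.add(lineNo);
                }
            }
        }
        return suspects;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java QuoteCheck <file>");
            return;
        }
        System.out.println("Lines with an odd number of quotes: "
                + linesWithOddQuotes(Paths.get(args[0])));
    }
}
```

Because it reads one line at a time, the check itself should run comfortably on files of the size mentioned above.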
Anonymous
Not applicable
Author

The -Xmx argument in the .ini file controls the memory available to the Studio itself, so it only matters up to the point of building a job. It makes no difference to the actual running of the job. The memory allocated to a running job is set by default through Window > Preferences > Talend > Run/Debug, or for a specific job under JVM arguments on the left side of the Run tab.

Thanks for this.
I've tried increasing the -Xmx by doing Window > Preferences > Talend > Run/Debug but when I increase Xmx from 1024 to 2048 I get the message:-
Could not create the Java virtual machine.
Error occurred during initialization of VM
Could not reserve enough space for object heap
Job Strength01 ended at 09:56 18/05/2011.
Can anyone suggest how I can find out the maximum value I can set -Xmx to?
Thanks
janhess
Creator II

It could be that you've described the data structure incorrectly.
Anonymous
Not applicable
Author

The job fails because a StringBuilder object is trying to expand its backing array to hold a very big record.
tFileInputDelimited with the CSV options enabled uses the third-party "com.csvreader.CsvReader" under the hood, so it is possible that you are using it already, because it is present in the stack trace.
You should post the job and example data if you want more optimization insights.

Thanks for your comments. I'm afraid I can't post sample data because it is sensitive, but I can post details of the job. Can you clarify which particular details you need, please?