Talend Data Integration: tFileInputRaw cannot load a file as a String if the file size is over 2GB

Jinge_Dong

Last Update: Apr 8, 2024 4:53:17 AM
Updated By: Shicong_Hong
Created date: Apr 8, 2024 4:48:23 AM

When tFileInputRaw is used to load a file's contents as a String, the Job fails with an OutOfMemoryError if the file is larger than 2 GB. This happens even when enough memory is available and the JVM "-Xmx" value has been increased.

Cause

A Java String is backed by an array (a char[] in Java 8 and earlier), and Java arrays are indexed by int. The maximum int value is Integer.MAX_VALUE, which is 2^31 - 1 (approximately 2 billion), so a String can hold at most about 2 billion characters, which corresponds to a file of roughly 2 GB. Holding that much text requires at least 4 GB of memory because each char is 2 bytes in Java, plus roughly another 4 GB while the String object is created, for a total of around 8 GB of heap space. A larger file cannot be loaded as a single String no matter how high "-Xmx" is set, because the limit comes from the array index, not from the heap size.
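The arithmetic behind the 8 GB figure can be sketched in a few lines of Java. This is only an illustration of the estimate described above, not Talend code: the class and method names are made up for the example, and the "one temporary copy" term is an assumption that stands in for the extra memory used while the String is being created.

```java
public class StringHeapEstimate {

    // Rough heap estimate, in bytes, for holding charCount characters as a String.
    // Assumptions: 2 bytes per char, plus one temporary buffer of the same size
    // while the String is being built.
    static long estimateHeapBytes(long charCount) {
        long backingArray = charCount * 2L; // 2 bytes per char
        long temporaryCopy = backingArray;  // assumed extra buffer during construction
        return backingArray + temporaryCopy;
    }

    public static void main(String[] args) {
        long maxChars = Integer.MAX_VALUE; // 2^31 - 1, the largest possible array index range
        long bytes = estimateHeapBytes(maxChars);
        System.out.printf("max chars: %d%n", maxChars);
        System.out.printf("estimated heap: %.1f GiB%n", bytes / (1024.0 * 1024 * 1024));
    }
}
```

For the maximum character count, the estimate works out to roughly 8 GiB, which matches the figure given above.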

Reference

Java String & Array limitations and OutOfMemoryError

 

Resolution

Reading a file of around 2 GB into memory in one piece is not a good design. Instead of loading such a file into a single String, split it into smaller files and then use tFileList to iterate over each file.
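One possible way to do the split outside of Talend is a small standalone Java program such as the sketch below. It is not part of the product, and it rests on several assumptions: the file is UTF-8 text with line breaks, splitting on line boundaries is acceptable, and the class name, method name, and "part-00001.txt" naming pattern are invented for the example. The file is streamed line by line, so it is never held in memory as a whole.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class FileSplitter {

    // Splits source into files of at most maxLinesPerPart lines inside targetDir
    // and returns the part files in order. Line endings are rewritten using the
    // platform line separator.
    static List<Path> splitByLines(Path source, Path targetDir, int maxLinesPerPart)
            throws IOException {
        if (maxLinesPerPart <= 0) {
            throw new IllegalArgumentException("maxLinesPerPart must be positive");
        }
        Files.createDirectories(targetDir);
        List<Path> parts = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            BufferedWriter writer = null;
            try {
                int linesInPart = 0;
                String line;
                while ((line = reader.readLine()) != null) {
                    // Start a new part file when none is open or the current one is full.
                    if (writer == null || linesInPart == maxLinesPerPart) {
                        if (writer != null) {
                            writer.close();
                        }
                        // Hypothetical naming pattern: part-00001.txt, part-00002.txt, ...
                        String name = String.format(Locale.ROOT, "part-%05d.txt", parts.size() + 1);
                        Path part = targetDir.resolve(name);
                        writer = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                        parts.add(part);
                        linesInPart = 0;
                    }
                    writer.write(line);
                    writer.newLine();
                    linesInPart++;
                }
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java FileSplitter <sourceFile> <targetDir> <maxLinesPerPart>
        List<Path> parts = splitByLines(
                Paths.get(args[0]), Paths.get(args[1]), Integer.parseInt(args[2]));
        System.out.println("Wrote " + parts.size() + " part file(s)");
    }
}
```

The target directory can then be used as the directory that tFileList iterates over. A file without line breaks would need a different splitting strategy, for example fixed-size chunks of characters.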

 

Environment

Talend Data Integration 

 
