Anonymous
Not applicable

[resolved] java.lang.OutOfMemoryError: Java heap space during flat file upload

Hi,
I am getting an out of memory error when I try to load a flat file that contains 11 million records. Since I don't have enough memory to increase my heap size, I would like suggestions on how to split the file and load it in parts rather than keeping everything in memory. I also cannot use the Store temp data option to spill to disk, because my tMap has no join with another table.
Kindly help me asap.
Thanks
Vijay
10 Replies
Anonymous
Not applicable
Author

Normally, reading a flat file should not need that much memory, because there is no need to keep the dataset in memory.
I guess there is something wrong in your job. Could you post a screenshot of your job, so we can try to spot potential memory leaks?
willm1
Creator
Creator

I agree with Jlolling... But if all else fails, you could consider using an OS command/utility such as split (on Linux) and cycling through the split files, as sketched below...
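If you went that route, the Talend side would only need to loop over the resulting chunk files (typically tFileList--iterate--> the load subjob). Here is a rough plain-Java sketch of that loop; the directory /tmp/chunks and the chunk_* file pattern are placeholders, not anything from the original job:

import java.io.IOException;
import java.nio.file.*;

// Hypothetical sketch: iterate over the chunk files produced by "split"
// (assumed here to be named chunk_*) and load each one in turn.
public class CycleThroughChunks {
    public static void main(String[] args) throws IOException {
        Path dir = Paths.get("/tmp/chunks");  // placeholder directory
        try (DirectoryStream<Path> chunks = Files.newDirectoryStream(dir, "chunk_*")) {
            for (Path chunk : chunks) {
                // placeholder for the real per-file load logic
                System.out.println("Loading " + chunk.getFileName());
            }
        }
    }
}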
Anonymous
Not applicable
Author

Hi, thanks for your response. Please find the screenshot below. Basically, we first check in a log table whether the job has already been processed for the given date. If it has not been processed, we use tMap to load the data from the flat file. Thanks in advance for your help.
(Screenshot of the job: 0683p000009MEHh.png)
Anonymous
Not applicable
Author

Hi,
I think it is the lookup data that consumes most of the memory when doing a join in tMap. Try storing the lookup data on disk instead of in memory, and allocate more memory to the job execution if you have more memory available; refer to this KB article:
https://community.talend.com/t5/Migration-Configuration-and/OutOfMemory-Exception/ta-p/21669?content...
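If you are not sure how much heap the job execution actually gets, you can print it from inside the job (for example in a tJava component). A minimal sketch, not taken from the KB article:

public class HeapCheck {
    public static void main(String[] args) {
        // Print the maximum heap the JVM may use and the heap currently in use,
        // to confirm that the memory setting was really applied to the job.
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        long usedMb = (Runtime.getRuntime().totalMemory()
                - Runtime.getRuntime().freeMemory()) / (1024 * 1024);
        System.out.println("Max heap: " + maxMb + " MB, used: " + usedMb + " MB");
    }
}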
Best regards
Shong
Anonymous
Not applicable
Author

Hi,
As I already mentioned, we don't have a join with another table in tMap, so we cannot set Store temp data to true and cannot use that option. Please provide an alternative solution based on the screenshot shared by eshvar.
Thanks,
Vijay 
Anonymous
Not applicable
Author

Hi Vijay,
"we don't have a join with another table in tMap, so we cannot set Store temp data to true"

From the job screenshot shared by eshvar, we can see there is a join in tMap. It is a big job that contains many subjobs, so I suggest debugging it step by step to find which subjob or component consumes so much memory: keep only one subjob active, deactivate the others, run the job, and see which subjob raises the error.
Best regards
Shong
Anonymous
Not applicable
Author

Hi Shong,
We are referring to the second tMap in the screenshot, i.e. reading the data from the txt (flat) file into tMap and writing it to the corresponding table (Account Links). The memory problem occurs while reading the data from the txt file into tMap. We would like to reduce this, and are looking for an option to store the data on disk rather than in memory, or any other alternative.
Thanks, Vijay
Anonymous
Not applicable
Author

Hello,
It is not possible to store the data on disk with a file component; the data it reads is handled in memory. If you cannot increase the memory allocated to the job execution, you can split the source file into several smaller files with a job. For example:
tLoop--iterate--tFileInputFullRow--main--tFileOutputDelimited
tLoop: runs a for loop.
From: 0
To: the total number of lines in the source file, or any number larger than the real line count.
Step: the number of lines you want in each small file, let's say 1000000.
tFileInputFullRow: reads the source file line by line.
Header: the starting line of the file; set the header to ((Integer)globalMap.get("tLoop_1_CURRENT_VALUE")).
Limit: set the limit to the same number as the Step parameter on tLoop, let's say 1000000.
tFileOutputDelimited: generates N small files of 1000000 lines each; set the file name with a dynamic path, for example:
"D:/work/file/test1/"+((Integer)globalMap.get("tLoop_1_CURRENT_ITERATION"))+"out.csv"
Best regards
Shong
Anonymous
Not applicable
Author

Hi Shong,
Thanks for your valuable input. We followed your suggestion ("tLoop--iterate--tFileInputFullRow--main--tFileOutputDelimited") and it worked well: splitting the source into files of around 50k records each brought the memory usage down to under 50 MB per file. I then used tFileDelete to remove these temporary files after each execution. Now I am able to process millions of records with much less memory.
Thanks with Regards,
Vijay