
sushantk19
Creator

Java heap space issue while reading large amounts of data from source

Hi,

 

I am trying to read a large amount of data, approximately 3 to 5 million records, from a SQL-server-like source (HeidiSQL). I keep getting a Java heap space error.

 

The error is:
Exception in thread "Thread-0" java.lang.OutOfMemoryError: Java heap space
at java.util.LinkedList.listIterator(Unknown Source)
at java.util.AbstractList.listIterator(Unknown Source)

 

I have read some of the community posts about this, as well as the documentation around it. Things I have tried:

 

1. Checked the Enable Stream option

2. Increased the JVM parameters with -Xms2048M and -Xmx4096M (see the example below)
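
For reference, the flags take a leading dash. In the Studio they go under the Run view's Advanced settings (Use specific JVM arguments); for any standalone Java program they would be passed like this (the JAR name here is just a placeholder):

java -Xms2048M -Xmx4096M -jar my_job.jar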

 

Is there anything else I can do to handle this?

My job design is: tMysqlInput ---> tLogRow ---> tMap ---> tDBOutput.

 

 

 

11 Replies
sushantk19
Creator
Author

@xdshi: Thanks for your suggestion! I have removed the tLogRow component from my job and am now re-testing it. Please find attached a screenshot of the tMap config settings.

 

Also, I noticed that no temp files are being generated at the defined path. What could be the reason for this?


java_heap_issue.png
Jackson0123
Contributor

The java.lang.OutOfMemoryError: Java heap space error in your case is happening because the job is trying to load too many records into memory at once. Even though you have already increased the JVM options, this type of issue can still occur if the application holds on to objects longer than necessary; keeping them in memory through components like tMap or tLogRow can quickly exhaust the available heap.
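
Concretely, the LinkedList.listIterator frames in your stack trace are consistent with some stage collecting every row into an in-memory list before iterating over it, roughly the anti-pattern in this illustrative sketch (the names here are made up, not your generated job code):

import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedList;
import java.util.List;

public class BufferingAntiPattern {
    // Anti-pattern: materialize every row before doing any work. With 3-5
    // million records the list alone exhausts the heap, and iterating it is
    // exactly where LinkedList.listIterator shows up in a stack trace.
    static void copyAll(ResultSet rs) throws SQLException {
        List<String> rows = new LinkedList<>();
        while (rs.next()) {
            rows.add(rs.getString("name")); // every row pinned on the heap
        }
        for (String row : rows) {
            process(row);
        }
    }

    static void process(String row) {
        // stand-in for the tMap / tDBOutput work
    }
}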

In situations like yours, I would suggest designing the job to process the data in smaller batches; that is far better than processing such a huge volume all at once. For example, configure the input component to use a streaming or cursor-based approach, so that records are fetched and processed row by row rather than being fully materialized in memory (see the sketch below). Also check whether any intermediate components store the entire dataset, such as tLogRow in debug mode; if so, avoid them, since they can keep large lists in memory.
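
Here is a minimal JDBC sketch of that row-by-row pattern, assuming a MySQL source: with Connector/J, a forward-only, read-only statement with a fetch size of Integer.MIN_VALUE streams rows one at a time instead of buffering the whole result set. The URLs, credentials, table, and column names are all placeholders:

import java.sql.*;

public class StreamingCopy {
    static final int BATCH_SIZE = 10_000;

    public static void main(String[] args) throws SQLException {
        try (Connection src = DriverManager.getConnection(
                     "jdbc:mysql://source-host/db", "user", "pass");
             Connection dst = DriverManager.getConnection(
                     "jdbc:mysql://target-host/db", "user", "pass");
             Statement read = src.createStatement(
                     ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
             PreparedStatement write = dst.prepareStatement(
                     "INSERT INTO target (id, name) VALUES (?, ?)")) {
            read.setFetchSize(Integer.MIN_VALUE); // stream rows one at a time
            dst.setAutoCommit(false);
            try (ResultSet rs = read.executeQuery("SELECT id, name FROM source")) {
                int inBatch = 0;
                while (rs.next()) {
                    write.setLong(1, rs.getLong("id"));
                    write.setString(2, rs.getString("name"));
                    write.addBatch();
                    if (++inBatch == BATCH_SIZE) {
                        write.executeBatch(); // flush and free the batch
                        dst.commit();
                        inBatch = 0;
                    }
                }
                if (inBatch > 0) {
                    write.executeBatch();
                    dst.commit();
                }
            }
        }
    }
}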

If your transformation logic allows, you can also split the workload by adding filters or partitioning the query, and process each subset in a separate run (see the sketch below). Another practical option is to increase the heap size further if system resources allow, but that should not be the only fix; streaming and partitioning are much more sustainable approaches for millions of records.
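
A sketch of the partitioning idea, slicing by a numeric key so each pass only ever touches a bounded range; in Talend this maps to a parameterized WHERE clause driven by a loop or by separate executions. The bounds, table, and column names are illustrative:

public class PartitionedExtract {
    public static void main(String[] args) {
        long maxId = 5_000_000L; // assumed upper bound of the key column
        long step = 500_000L;    // rows per slice
        for (long lo = 0; lo < maxId; lo += step) {
            String sql = "SELECT id, name FROM source WHERE id >= " + lo
                    + " AND id < " + (lo + step);
            System.out.println("slice query: " + sql);
            // each slice is extracted, transformed, and loaded on its own,
            // so memory usage is bounded by the slice size, not the table
        }
    }
}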

With the changes discussed here in place, you should be able to prevent memory overflows and process your large datasets more efficiently. If you want to dig deeper into the OutOfMemoryError: Java heap space error, check out the blog How to Solve OutOfMemoryError: Java heap space; it gives more insight into this error.