Anonymous
Not applicable

[resolved] OutOfMemoryError: GC overhead limit exceeded in a simple tMSSqlOutputBulkExec job

I'm doing a simple, three-component job: tFileInputExcel > tMap > tMSSqlOutputBulkExec.
The input file has just 11,923 rows (it was just written by another TOS DI job) and the tMap does no processing except some row mapping.
The tMSSqlOutputBulkExec uses a just-retrieved repository definition for the database table and is set to append to the SQL table.
Execution starts out slowly and slows to a crawl until it stops at row 5,287 with the following error. (The temporary file, mssql_data.txt, finishes with 5,206 rows written.)
Exception in thread "main" java.lang.Error: java.lang.OutOfMemoryError: GC overhead limit exceeded
at masterproviderdatabase.dhpl_all_insertlicenseesintompdproviders_0_1.DHPL_All_InsertLicenseesIntoMPDProviders.tFileInputExcel_1Process(DHPL_All_InsertLicenseesIntoMPDProviders.java:3830)
at masterproviderdatabase.dhpl_all_insertlicenseesintompdproviders_0_1.DHPL_All_InsertLicenseesIntoMPDProviders.runJobInTOS(DHPL_All_InsertLicenseesIntoMPDProviders.java:4012)
at masterproviderdatabase.dhpl_all_insertlicenseesintompdproviders_0_1.DHPL_All_InsertLicenseesIntoMPDProviders.main(DHPL_All_InsertLicenseesIntoMPDProviders.java:3877)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at sun.reflect.GeneratedConstructorAccessor6.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
at org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createUnattachedNode(SchemaTypeImpl.java:1859)
at org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createElementType(SchemaTypeImpl.java:1021)
at org.apache.xmlbeans.impl.values.XmlObjectBase.create_element_user(XmlObjectBase.java:893)
at org.apache.xmlbeans.impl.store.Xobj.getUser(Xobj.java:1657)
at org.apache.xmlbeans.impl.store.Xobj.find_element_user(Xobj.java:2062)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.impl.CTCellImpl.getIs(Unknown Source)
at org.apache.poi.xssf.usermodel.XSSFCell.getRichStringCellValue(XSSFCell.java:269)
at org.apache.poi.xssf.usermodel.XSSFCell.getRichStringCellValue(XSSFCell.java:64)
at masterproviderdatabase.dhpl_all_insertlicenseesintompdproviders_0_1.DHPL_All_InsertLicenseesIntoMPDProviders.tFileInputExcel_1Process(DHPL_All_InsertLicenseesIntoMPDProviders.java:2457)
... 2 more
I've done much bigger SQL outputs than this. I tried restarting Windows (7) to clear any cobwebs, but no difference.
Any suggestions? Thanks!
UPDATE: I just replaced the tMSSqlOutputBulkExec and then the tMap with a tLogRow, disconnecting them until I had only the tFileInputExcel and the tLogRow, and it still fails the same way! ??
UPDATE: On the Advanced settings tab of the tFileInputExcel I found the Generation mode field and set it to "Less memory consumed . . ." All set now.
25 Replies
soujanyam
Contributor

Thanks for your reply.
I split the input directory into parts, read the parts with separate tFileLists, and appended them through tUnite. For two directories (less than 50% of the data) I get a result, but for the total data I don't.
What might be the proper solution for this?
Anonymous
Not applicable
Author

Your use case is to iterate through a very large list of files? Unfortunately, tFileList holds a representation of every file (the java.io.File object) in memory, and this sometimes causes an OutOfMemoryError. The only way, currently, to solve this is to filter the files to prevent reading all of them.
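To illustrate the difference (a plain-Java sketch of the general idea, not the code Talend actually generates): listing a directory eagerly keeps one object per entry alive for the whole run, while a streaming listing with a filter keeps only the current entry alive.

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ListingSketch {
    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args[0]);

        // Eager listing: one java.io.File object per entry, all held in
        // memory at once, roughly what a component that materializes the
        // whole file list does.
        // java.io.File[] all = dir.toFile().listFiles();

        // Streaming listing with a glob filter: only the current entry is
        // live, and entries that don't match are never handed to your code.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.xlsx")) {
            for (Path file : stream) {
                System.out.println("processing " + file); // stand-in for the real per-file work
            }
        }
    }
}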
soujanyam
Contributor

Thanks for your prompt reply.
That's why I used two tFileLists joined with a tUnite, followed by another tUnite whose other input comes from another tFileList, and so on.
Even so, I am not getting a result.
What could I do?
soujanyam
Contributor

My requirement is to read all the files from the directory in a single job and process them further. That's why I used multiple tFileLists with subdirectories as input and joined them using tUnite.
Is there any other way to resolve my problem?
Thanks in advance.
 
Anonymous
Not applicable
Author

The  “java.lang.OutOfMemoryError: GC overhead limit exceeded” error indicates that GC has been trying to free the memory but is pretty much unable to get any job done. By default it happens when the JVM is spending more than 98% of the total time in GC and after GC less than 2% of the heap is recovered.

What would happen if this GC overhead limit were not present? Note that the "GC overhead limit exceeded" error is thrown only when less than 2% of the memory is freed after several GC cycles. This means that the little amount the GC is able to clean will quickly be filled again, forcing the GC to restart the cleaning process. This forms a vicious cycle where the CPU is 100% busy with GC and no actual work can be done. End users of the application face extreme slowdowns: operations which used to complete in milliseconds now take minutes to finish.
So the "java.lang.OutOfMemoryError: GC overhead limit exceeded" message is a pretty nice example of the fail-fast principle in action. Bearing this in mind, disabling the check via -XX:-UseGCOverheadLimit is possibly the worst thing to do: you are just masking the symptoms instead of solving the problem.
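As a contrived illustration (run with a deliberately small heap, for example java -Xmx64m -XX:+UseParallelGC GcOverheadSketch), the following loop keeps almost all of its data reachable, so each GC cycle reclaims next to nothing and the JVM typically dies with exactly this error:

import java.util.HashMap;
import java.util.Map;

public class GcOverheadSketch {
    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        // Every entry stays reachable through the map, so the GC can never
        // recover more than a sliver of the heap per cycle.
        for (int i = 0; ; i++) {
            map.put(i, "value-" + i);
        }
    }
}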

A fast way to give (temporary) relief to the GC is to give more memory to the process. This is as easy as adding (or increasing, if already present) one parameter, -Xmx, in your startup command, similar to the following example:

java -Xmx1024m com.yourcompany.YourClass

If you wish to understand the underlying cause in more detail, with examples and more advanced solutions, see the detailed post about the java.lang.OutOfMemoryError: GC overhead limit exceeded error at https://plumbr.eu/outofmemoryerror/gc-overhead-limit-exceeded
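For a Talend job you would normally not type that command yourself; com.yourcompany.YourClass is just a placeholder for whatever main class is run. Assuming you launch the job from the Studio (the thread doesn't say), the -Xmx value is set in the Run view under Advanced settings ("Use specific JVM arguments"); for an exported job, raise the -Xmx value on the java line of the generated .bat/.sh launcher instead, for example (classpath omitted; the class name is this job's main class from the stack trace above):

java -Xmx2048m masterproviderdatabase.dhpl_all_insertlicenseesintompdproviders_0_1.DHPL_All_InsertLicenseesIntoMPDProviders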
soujanyam
Contributor

Thanks for your reply, but I didn't understand the line below:
java -Xmx1024m com.yourcompany.YourClass

Where should I add this, in the .ini file or somewhere else? And what are yourcompany and yourclass?
Please reply.
Anonymous
Not applicable
Author

You misunderstood me. Using two tFileList components is not a solution, because all the magic happens in the same job run -> same JVM -> same memory. You have to design your job so that it takes a smaller portion of the work, processes it, and finishes, and is then restarted to continue with the next portion of files or whatever you have to process.
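A minimal sketch of that pattern, assuming the job has been exported as a standalone Java job that accepts a context parameter named inputDir for its input directory (the parameter name, paths, and main class below are hypothetical):

import java.io.File;
import java.io.IOException;

public class RunPerDirectory {
    public static void main(String[] args) throws IOException, InterruptedException {
        // One fresh JVM per subdirectory, so the heap is fully released
        // between portions instead of accumulating across the whole run.
        File[] dirs = new File("/data/input").listFiles(File::isDirectory);
        if (dirs == null) return;
        for (File dir : dirs) {
            Process p = new ProcessBuilder(
                    "java", "-Xmx1024m",
                    "com.yourcompany.YourClass",            // placeholder for the exported job's main class
                    "--context_param", "inputDir=" + dir)   // hypothetical context parameter
                .inheritIO()
                .start();
            if (p.waitFor() != 0) break; // stop at the first failing portion
        }
    }
}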
soujanyam
Contributor

Sorry sir, I was not able to understand this:
"You have to design your job so that it takes a smaller portion of the work, processes it, and finishes, and is then restarted to continue with the next portion of files or whatever you have to process."
As I said, my requirement is to read all the files from the directory in a single job and process them further. That's why I used multiple tFileLists with subdirectories as input and joined them using tUnite.
Anonymous
Not applicable
Author

my requirement is to read all the files from the directory in a single job and process them further

but it seems that you cannot, because of the memory allocated to the JVM for your job.
To use an image: with your car, you cannot drive more miles even if you put the gasoline into two tanks (your tFileLists).
You could increase the allocated JVM memory to give enough heap space for your job? That could be a solution.
As jLolling already said, process it in several passes. It is the same job (your requirement), but you do NOT process ALL the files at the same time; you process them in several passes so the memory is released between each pass.
Regards,
Laurent
soujanyam
Contributor

Thanks for your reply, Laurent. I tried it like that, with only one tFileList at first and then iterating the output with the remaining input, but it was no use.