Anonymous
Not applicable

tFileInputExcel_1 max. ratio of compressed file size to the size of the expanded data (Zip bomb detected!)

I have a job that reads an Excel file using tFileInputExcel, extracts some information from it, enriches that information, and writes it back to the same Excel file. Everything worked as expected while the Excel file had 30 lines. Now that the file has 400 lines, I get the following exception:

 

 

Exception in component tFileInputExcel_1 (myjob)
org.apache.poi.POIXMLException: java.lang.reflect.InvocationTargetException
at org.apache.poi.POIXMLFactory.createDocumentPart(POIXMLFactory.java:63)
at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:604)
at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:186)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:266)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:336)
at jobName.myjob.tFileInputExcel_1Process(myjob.java:15007)
at jobName.myjob.tFileInputExcel_2Process(myjob.java:7404)
at jobName.myjob.tFileExcelWorkbookOpen_1Process(myjob.java:640)
at jobName.myjob.runJobInTOS(myjob.java:19176)
at jobName.myjob.main(myjob.java:19011)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:56)
at org.apache.poi.POIXMLFactory.createDocumentPart(POIXMLFactory.java:60)
... 9 more
Caused by: java.io.IOException: Zip bomb detected! The file would exceed the max. ratio of compressed file size to the size of the expanded data. This may indicate that the file is used to inflate memory usage and thus could pose a security risk. You can adjust this limit via ZipSecureFile.setMinInflateRatio() if you need to work with files which exceed this limit. Counter: 827247, cis.counter: 8192, ratio: 0.009902725546299956
Limits: MIN_INFLATE_RATIO: 0.01
at org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream.advance(ZipSecureFile.java:270)
at org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream.read(ZipSecureFile.java:221)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(XMLEntityManager.java:2919)
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:302)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1895)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanQName(XMLEntityScanner.java:843)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:193)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2784)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
at org.apache.poi.util.DocumentHelper.readDocument(DocumentHelper.java:140)
at org.apache.poi.POIXMLTypeLoader.parse(POIXMLTypeLoader.java:143)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.StyleSheetDocument$Factory.parse(Unknown Source)
at org.apache.poi.xssf.model.StylesTable.readFrom(StylesTable.java:194)
at org.apache.poi.xssf.model.StylesTable.<init>(StylesTable.java:145)
... 15 more

 

The file is not that big, around 350 KB. According to the exception itself, the fix is to call the Apache POI method ZipSecureFile.setMinInflateRatio().
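
For reference, the ratio printed in the exception can be reproduced by hand. The sketch below assumes that "cis.counter" is the number of compressed bytes read so far and "Counter" is the number of expanded bytes; that reading is inferred from the arithmetic matching the message, not taken from the POI source. The class and method names are made up for illustration.

```java
public class InflateRatioFromException {

    // Ratio of compressed bytes to expanded bytes, as named in the exception text.
    static double inflateRatio(long compressedBytes, long expandedBytes) {
        return (double) compressedBytes / (double) expandedBytes;
    }

    public static void main(String[] args) {
        // Values copied from the exception message above.
        long cisCounter = 8192L;    // assumed: compressed bytes read
        long counter = 827247L;     // assumed: expanded bytes produced
        double limit = 0.01;        // MIN_INFLATE_RATIO reported in the message

        double ratio = inflateRatio(cisCounter, counter);
        System.out.println("ratio            = " + ratio);
        System.out.println("expansion factor = " + (1.0 / ratio));
        System.out.println("below the limit  = " + (ratio < limit));
    }
}
```

So the part of the file being read expands to roughly 101 times its compressed size, just past the 100 times that a limit of 0.01 allows.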

 

Does anyone know how to solve this? Is there a way to make the Talend component set this on the Apache POI library? 

 

More information about this error:

https://stackoverflow.com/questions/44897500/using-apache-poi-zip-bomb-detected 

https://stackoverflow.com/questions/46814489/understanding-zipsecurefile-setmininflateratiodouble-ra...

 

3 Replies
Anonymous
Not applicable
Author

According to the first Stack Overflow post that you linked, this can happen when the actual Excel data is so similar that it compresses to almost nothing. If you're only going to be dealing with 64k rows or fewer, my first thought is to use an .xls file instead (i.e. the old Excel 2003 format, which isn't compressed).

 

If that's not possible, open the Excel spreadsheet, copy the data to a new sheet, delete the old sheet, and save the file. This resets Excel's internal counters so that they point only to your data, rather than to cells where data was previously stored. Excel sometimes "remembers" where data has been, which is why a spreadsheet that once contained 500k rows may still scroll down to row 500k after most of the data in those rows has been deleted. If Excel treats these extra rows as containing nulls or empty strings, then the compression algorithm used to create the .xlsx file is compressing data that is almost entirely identical, which would cause the error you are seeing.
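
To illustrate why near-identical cells trip the check, here is a minimal sketch that deflates a run of identical cell elements with the JDK's own java.util.zip classes and prints the compressed-to-expanded ratio. The cell markup and the count of 100,000 are invented for the example, and DeflaterOutputStream adds a zlib wrapper that a real .xlsx ZIP entry does not have, so the number is only indicative.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;

public class RepetitiveDataRatio {

    // Deflate the given bytes in memory and return the compressed size.
    static int compressedSize(byte[] data) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(buffer)) {
            out.write(data);
        }
        return buffer.size();
    }

    // Compressed-to-expanded ratio for `count` copies of the same cell element.
    static double ratioForRepeatedCell(String cell, int count) throws IOException {
        StringBuilder xml = new StringBuilder(cell.length() * count);
        for (int i = 0; i < count; i++) {
            xml.append(cell);
        }
        byte[] data = xml.toString().getBytes(StandardCharsets.UTF_8);
        return (double) compressedSize(data) / data.length;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical empty, styled cell repeated many times (not real sheet XML).
        double ratio = ratioForRepeatedCell("<c s=\"1\"/>", 100_000);
        System.out.println("compressed/expanded ratio = " + ratio);
        System.out.println("below POI's default 0.01  = " + (ratio < 0.01));
    }
}
```

Data this repetitive lands well under 0.01, which is consistent with the explanation above: the file does not need to be large, only very uniform.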

 

Hope this helps!

Anonymous
Not applicable
Author

Thank you DVSCHWAB, 

 

As a workaround, I removed some empty columns from the file to reduce Excel's internal counters. That is still not a valid solution in my scenario, because the files are uploaded by the end users of a data integration solution.

 

I will try the third-party tFileExcelSheetInput component, as it seems to set the ratio to 0 in Apache POI.

 

https://github.com/jlolling/talendcomp_tFileExcel/blob/6c8baba662db57dc77fb7bb7cf2a1008a54df8fc/src/...

Anonymous
Not applicable
Author

You can add a tJava component before the Excel input component and set the ratio there:

 

org.apache.poi.openxml4j.util.ZipSecureFile.setMinInflateRatio(-1.0d);
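
A value of -1.0 switches the check off for everything the job reads. If that is more than you want, a smaller positive limit may be enough. The sketch below is a simplified model of the comparison described in the exception message (reject when the ratio falls below the limit), not Apache POI's actual code, and the candidate values such as 0.005 are only examples. It uses the counters from the exception in the original post.

```java
public class MinInflateRatioChoice {

    // Simplified model: a part is rejected when compressed/expanded drops below the limit.
    // This mirrors the wording of the exception, not POI's real implementation.
    static boolean wouldBeRejected(long compressedBytes, long expandedBytes, double minInflateRatio) {
        double ratio = (double) compressedBytes / (double) expandedBytes;
        return ratio < minInflateRatio;
    }

    public static void main(String[] args) {
        long compressed = 8192L;   // cis.counter from the exception
        long expanded = 827247L;   // Counter from the exception

        // Hypothetical limits to compare; 0.01 is the default shown in the message.
        double[] candidates = {0.01, 0.005, 0.0, -1.0};
        for (double limit : candidates) {
            System.out.println("limit " + limit + " -> rejected: "
                    + wouldBeRejected(compressed, expanded, limit));
        }
    }
}
```

Under this model only the default 0.01 rejects the counters from the original post. Keep in mind that the exception reports the ratio at the moment the check failed, so other parts of the same workbook may compress further and need a lower limit still.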