Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I have a simple flow to manipulate data from Excel multi sheet which fails since I installed Microsoft Office 2019.
The flow is based on the following sequencial tool :
GoogleDriveConnection -> GoogleDriveGet -> FileInputExcel
To give more details, the google spreadsheet recovered with GoogleDriveGet is composed of many sheets.
GoogleDriveGet tool uses the format MS Excel to save the stream to a file. The resulting saved file has the extension .XLSX. No issue on file saving step.
Then, FileInputExcel tries to read that previous saved file, using excel2007 option and Excel Metadata referencial, and the following issue happens :
------------
Démarrage du job Load_GoogleSheet a 18:24 06/03/2020.
[statistics] connecting to socket on port 3391
[statistics] connected
Exception in component tFileInputExcel_9 (Load_GoogleSheet)
java.io.IOException: org.apache.poi.openxml4j.exceptions.InvalidFormatException: invalid relationship part/id
at org.apache.poi.ooxml.POIXMLDocument.openPackage(POIXMLDocument.java:91)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:336)
at extra_bi.load_googlesheet_0_1.Load_GoogleSheet.tFileInputExcel_9Process(Load_GoogleSheet.java:15234)
at extra_bi.load_googlesheet_0_1.Load_GoogleSheet.tDBConnection_1Process(Load_GoogleSheet.java:947)
at extra_bi.load_googlesheet_0_1.Load_GoogleSheet.runJobInTOS(Load_GoogleSheet.java:24854)
at extra_bi.load_googlesheet_0_1.Load_GoogleSheet.main(Load_GoogleSheet.java:24657)
Caused by: org.apache.poi.openxml4j.exceptions.InvalidFormatException: invalid relationship part/id
at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.parseRelationshipsPart(PackageRelationshipCollection.java:358)
at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.<init>(PackageRelationshipCollection.java:163)
at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.<init>(PackageRelationshipCollection.java:133)
at org.apache.poi.openxml4j.opc.PackagePart.loadRelationships(PackagePart.java:570)
at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:728)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:237)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:166)
at org.apache.poi.ooxml.POIXMLDocument.openPackage(POIXMLDocument.java:89)
... 5 more
[statistics] disconnected
Le Job Load_GoogleSheet sest terminé à {1,date,HH:mm dd/MM/yyyy}. [code de sortie={2,number}]
------------
As I said above, the flow seems to fail since I installed Microsoft Office 2019. It might affect Excel reading step...
I am using Talend Open Studio 7.2.1 on Windows 10 Pro 64bits.
When I open the file recovered by GoogleDriveGet, with my Excel 2019 and save the file, then the FileInputExcel step is working. Metadata Excel are well interpreted.
Do you have any idea of what can fail and how I can solve that ?
Thank you for your help.
Romain
the same issue in v7.3.1
but my office version is office 2016
@Jo Hung , do you also read an Excel file downloaded from Google drive? It seem it is an issue in the export process, I found someone report this issue on google page.
@Shicong Hong , I opened the excel file, made some modification and save again, then my job works fine.
But the same job in v7.1.1 works fine and I do not need to re-save the excel file.