Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Jan
I'm trying to use tFileExcelWorkbookOpen and tFileExcelSheetInput to load a 17Mb xlsx 250k rows xlsx file, but get a GC error.
The file is different to some others of similar size which do load, so it looks like the routines aren't processing it quite right. When I do a file <filename> it indicates its a Microsoft OOXML file, whereas the others are Microsoft Excel 2007+
Would you expect to be able to read this file, or should I convert it by opening in Excel and re-writing it?
I sent that too soon, it appears the other format has the same problem, even with mx:4g.
Any chance you can modify the code to allow streaming?
The tFileInputExcel component doesn't exhibit this problem, but has a problem formatting dates consistently between the OOXML and the Excel format files, which your component resolved for smaller files.
Hello,
Have you tried to set up Excel metadata with 'Read excel2007 file format' when use tFileInputExcel component to read your files?
Best regards
Sabrina
Hi,
I'm guessing downstream of the t{something}Input component., you're doing something with the data, could it be that is the issue?
For example, on a tMap, putting something into the "Temp data directory path" and putting the "Max buffer size" to something sensible.
Thanks
David
No the problem can be repeated even with just the tFileExcelWorkbookOpen component in a job which isn't great as there are no options to try to solve it. Even with -Xmx2048M it fails, so something is wrong given the xlsx is less than 20Mb. It grinds away for about a minute then raises the GC overhead limit exceeded exception.
Yes had that set for both formats yet the OOXML generated dd/mm/yyyy and the Microsoft Excel 2007+ format was m/d/yy. Both were being displayed as Date dd/mm/yyyy when you opened them in Excel or Libre Office. I'm going to have to get the component to throw an exception if the wrong file format is supplied and get it converted.
Slightly different solution to consider, use a freely available Java library to extract the data for a given sheet out of the XLS into a CSV and then import the CSV.
Thanks
David