[resolved] Excel - Spreadsheet with over 200K rows to load into database
I have an Excel spreadsheet with over 200K rows and 18 columns that I need to load into a database. The data contains commas. I keep running into "java.lang.OutOfMemoryError: Java heap space".
My job looks like:
tFileInputExcel > tMap > tTeradataOutput
I have tried lowering "Commit every" to 100. I am not doing any transformations on the data; I am just passing it through to be loaded.
What else can I do?
If you look at your Task Manager while the job is executing, is the java.exe process taking a lot of memory? Also, if you have statistics enabled, how much data has been loaded by the time you hit the memory issue?
Also, is your Excel file local or on a shared drive?
Is the Teradata DB directly accessible, or is it behind a bunch of firewalls?
You could try one thing:
1- Load the Excel file into a CSV file (tFileOutputDelimited).
        |
   OnSubjobOK
        |
        V
2- Load the CSV file into Teradata.
Adding an additional step might help you.
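One caveat with the intermediate CSV step: since your data has embedded commas, the delimited output only round-trips safely if fields are enclosed in quotes (in tFileOutputDelimited, set the text enclosure to "). As a rough sketch of the escaping rule Talend applies, in plain Java (file name out.csv and the sample rows are just illustrative):

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class CsvQuoting {
    // RFC 4180-style quoting: wrap a field in quotes if it contains a
    // comma, a quote, or a newline, and double any embedded quotes.
    static String quote(String field) {
        if (field.contains(",") || field.contains("\"") || field.contains("\n")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    public static void main(String[] args) throws IOException {
        List<String[]> rows = List.of(
            new String[]{"1", "Smith, John", "ACME \"East\""},
            new String[]{"2", "Doe Jane", "plain value"}
        );
        try (BufferedWriter w = Files.newBufferedWriter(Path.of("out.csv"))) {
            for (String[] row : rows) {
                StringBuilder line = new StringBuilder();
                for (int i = 0; i < row.length; i++) {
                    if (i > 0) line.append(',');
                    line.append(quote(row[i]));
                }
                w.write(line.toString());   // e.g. 1,"Smith, John","ACME ""East"""
                w.newLine();
            }
        }
    }
}
```

Without the enclosure, a value like "Smith, John" would split into two columns when the CSV is read back into Teradata.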
I don't have "Use specific JVM arguments" checked. With 4 GB of RAM, should I check it and use -Xms64M and -Xmx1024M?
Statistics is checked
Save Job before execution is checked
Exec time is checked
Clear before run is checked
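For reference, checking "Use specific JVM arguments" just adds heap flags to the java command that runs the job. A typical pair of values would look like this (the job class name here is only a placeholder; with 4 GB of RAM you have room to raise -Xmx above 1024M if needed):

```shell
# -Xms = initial heap, -Xmx = maximum heap; note the trailing M (megabytes)
java -Xms64M -Xmx1024M -cp job.jar my.project.MyTalendJob
```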