I changed TOS_DI-win-x86_64.ini to:
-vmargs
-Xms15120M
-Xmx20480M
-XX:MaxPermSize=18048m
-XX:+UseParallelGC
-Dfile.encoding=UTF-8
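(As a quick sanity check that the Studio JVM really picked these values up, the standard Runtime API can print the effective limits; the same two calls also work from a tJava component:)

public class HeapCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects -Xmx; totalMemory() is the heap committed so far
        System.out.println("max heap (MB):   " + rt.maxMemory() / (1024 * 1024));
        System.out.println("total heap (MB): " + rt.totalMemory() / (1024 * 1024));
    }
}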
My input file has 11.6M rows with 83 columns per row.
My job:
tFileInputDelimited > tFileOutputDelimited
The tFileInputDelimited component has the CSV options checkbox enabled in its Advanced settings.
I am trying to split the file into several smaller output files, but the job always fails at row 2,402,585:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
at java.lang.AbstractStringBuilder.append(Unknown Source)
at java.lang.StringBuilder.append(Unknown Source)
at com.talend.csv.CSVReader.readNext(CSVReader.java:288)
at talenddemosjava.myJob_ver3_0_1.myJob_ver3.tFileList_1Process(myJob_ver3.java:7313)
at talenddemosjava.myJob_ver3_0_1.myJob_ver3.runJobInTOS(myJob_ver3.java:14507)
at talenddemosjava.myJob_ver3_0_1.myJob_ver3.main(myJob_ver3.java:14352)
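For context: the OOM is raised from StringBuilder.append inside CSVReader.readNext, which is the typical signature of a quoted field that never closes, so the reader keeps appending the rest of the file into one buffer. A minimal sketch of that failure mode (illustrative logic only, not the actual com.talend.csv.CSVReader source):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class QuotedFieldSketch {
    // Naive quote-aware reader: append characters until the closing quote.
    static String readQuotedField(BufferedReader in) throws IOException {
        StringBuilder field = new StringBuilder();
        int c;
        while ((c = != -1) {
            if (c == '"') return field.toString(); // closing quote ends the field
            field.append((char) c);                // otherwise the buffer keeps growing
        }
        return field.toString(); // EOF: the whole remainder became one field
    }

    public static void main(String[] args) throws IOException {
        // The field's closing quote never arrives, so everything up to EOF
        // lands in one ever-growing StringBuilder. On an 11.6M-row file
        // that buffer eventually exceeds the heap.
        BufferedReader in = new BufferedReader(
                new StringReader("b ... imagine millions of rows here ..."));
        System.out.println(readQuotedField(in));
    }
}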
When I changed the job to the following, it ran successfully:
tFileInputFullRow > tFileOutputDelimited
Is there a maximum limit of 2,402,585 rows when CSV options are enabled?
Hi,
You can check: what does 2,402,585 * 83 * (size of all columns) come to? (A rough worked version of this estimate follows this post.)
If you are already splitting into several files, did you test with 12 files of 1M rows each? And if you are splitting anyway, why not split to a more moderate size?
Also, is tFileInputDelimited > tFileOutputDelimited your full job design, or do you have something between the components? (If there is no transformation, why do you need this job at all?)
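As a rough worked version of that estimate (the 40 bytes per value is an assumed average for Java String payload plus object overhead, not a measured figure):

public class HeapEstimate {
    public static void main(String[] args) {
        long rows = 2_402_585L;    // rows parsed before the OOM
        long cols = 83L;           // columns per row
        long bytesPerValue = 40L;  // assumption: chars + String/object overhead
        double gb = rows * cols * bytesPerValue / (1024.0 * 1024 * 1024);
        System.out.printf("~%.1f GB if every value were held at once%n", gb);
        // prints ~7.4 GB -- big, but still under the 20 GB -Xmx,
        // so plain data volume alone should not exhaust the heap.
    }
}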
Thanks, vapukov.
I did split the input into 12 files of 1M rows each, but I still got the Java heap space error.
Next, I split it into 300K-row files and still got the error.
Then I turned on "Check each row structure against schema". It turned out about 6 rows had a \" in one of the columns, which was throwing off the column parsing.
I changed the escape character to "\\" and the job ran successfully!
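For anyone hitting the same thing, here is a small sketch of what the escape setting changes (illustrative parser logic assuming the common backslash-escape convention, not Talend's actual reader): with \\ configured as the escape character, a \" inside a quoted field is consumed as a literal quote instead of the bare " being misread as the end of the field.

public class EscapeSketch {
    // Parse one quoted field, honoring backslash escapes.
    static String parseQuotedField(String s) {
        StringBuilder field = new StringBuilder();
        for (int i = 1; i < s.length(); i++) {        // i = 1 skips the opening quote
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                field.append(s.charAt(++i));          // escaped char taken literally
            } else if (c == '"') {
                break;                                // unescaped quote closes the field
            } else {
                field.append(c);
            }
        }
        return field.toString();
    }

    public static void main(String[] args) {
        // The \" no longer terminates the field, so the row keeps all 83 columns.
        System.out.println(parseQuotedField("\"he said \\\"hi\\\", then left\""));
        // prints: he said "hi", then left
    }
}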