Hi All.
I want to split a giant csv file into several smaller files according to the first three characters in the row. I have the following:
tFileInputFullRow --(row1)--> tJavaRow --(row2)-->tFileOutputRaw
* tFileInputFullRow reads each line into a "line" column
* tJavaRow reads:
       globalMap.put("rowType", row1.line.substring(0, 3));
       output_row.line = input_row.line;
* tFileOutputRaw has as filename "path/to"+((String)globalMap.get("rowType"))+".csv"
All I get as a result is a null.csv file staring back at me.
However, when I do:
tFileInputFullRow --(row1)--> tMap --(row2)--> tFlowToIterate --(iterate)--> tFixedFlowInput --> tFileOutputDelimited
with
* tMap adding a new column "type" to row2, defined with the same expression as in the tJavaRow above
* tFileOutputDelimited has the same name as tFileOutputRaw.
This time I do get the different files created!!
Why does this happen? I'm asking because the first design runs much faster than the second (mainly because it doesn't have to handle each of the 50 columns for each of the 600,000+ rows).
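For comparison, here is a rough plain-Java sketch of the split I am after, outside Talend. It is only an illustration: the class name PrefixSplitter, the method split, the UTF-8 encoding and the rule of skipping lines shorter than three characters are my own assumptions, and it takes for granted that the prefixes are safe to use in file names and that there are few enough distinct prefixes to keep one writer open per prefix.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Talend: splits a text file by line prefix.
public class PrefixSplitter {

    /**
     * Writes every line of input to outputDir/<first three characters>.csv
     * and returns the number of distinct output files created.
     */
    public static int split(Path input, Path outputDir) throws IOException {
        // One open writer per prefix, so each line is written exactly once.
        Map<String, BufferedWriter> writers = new HashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.length() < 3) {
                    continue; // assumption: lines too short to have a prefix are skipped
                }
                String rowType = line.substring(0, 3);
                BufferedWriter writer = writers.get(rowType);
                if (writer == null) {
                    // The file name is computed here, per prefix, while rows are flowing.
                    writer = Files.newBufferedWriter(
                            outputDir.resolve(rowType + ".csv"), StandardCharsets.UTF_8);
                    writers.put(rowType, writer);
                }
                writer.write(line);
                writer.newLine();
            }
        } finally {
            // Sketch-level cleanup: a failure on one close would skip the remaining ones.
            for (BufferedWriter writer : writers.values()) {
                writer.close();
            }
        }
        return writers.size();
    }
}
```

Each line is treated as an opaque string and never parsed into columns, which is the same reason the full-row design is the faster of the two.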
cheers
David
Even though the global variable is changed? (I have added another tJavaRow just to print the value of the global variable, and the value does indeed change on each row...)
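To make the question concrete, here is a small plain-Java sketch of one explanation that would fit both observations. It assumes, without my having confirmed it in the generated job code, that the output component builds its file name once before any row flows, while the tJavaRow updates globalMap on every row. The class EvaluatedOnceDemo and the method run are hypothetical names used only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not Talend-generated code.
public class EvaluatedOnceDemo {

    /** Returns { file name built before the loop, file name built after the last row }. */
    public static String[] run(String[] lines) {
        Map<String, Object> globalMap = new HashMap<>();

        // Assumed behaviour: the file name expression is evaluated once, up front.
        // "rowType" has not been set yet, so the null is concatenated as "null".
        String fileNameAtStart = "path/to" + ((String) globalMap.get("rowType")) + ".csv";

        String fileNameAfterRows = null;
        for (String line : lines) {
            // Per-row update, as in the tJavaRow: the value really does change each row.
            globalMap.put("rowType", line.substring(0, 3));
            fileNameAfterRows = "path/to" + ((String) globalMap.get("rowType")) + ".csv";
        }
        return new String[] { fileNameAtStart, fileNameAfterRows };
    }
}
```

Under that assumption, printing the variable in a second tJavaRow would show it changing on every row, yet the file name would already have been fixed with a null in it.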