Just wanted to put a thought out there before I submit a feature request:
One of our newbies ran into trouble because the input file he was using in a job had trailing spaces after the values in a column defined in the schema as Integer, which caused every row to be rejected. It seems to me that all input components should automatically trim numeric columns before attempting to parse them to the target data type, regardless of the advanced trim settings.
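For anyone who wants to reproduce the problem outside of a job, the root cause is simply how Java parses integers; a single trailing space is enough to reject the value. A minimal, plain-Java illustration (not Talend-specific):

public class TrimDemo {
    public static void main(String[] args) {
        System.out.println(Integer.parseInt("123"));          // 123
        System.out.println(Integer.parseInt("123 ".trim()));  // 123 once the trailing space is removed
        System.out.println(Integer.parseInt("123 "));         // throws NumberFormatException: For input string: "123 "
    }
}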
Comments, please?
Or, at the very least, produce a clear exception during parsing that explains the reason for the rejection.
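Something along these lines in the parsing code would already help a lot. This is only a rough sketch; the helper name and the column-name argument are made up, not an existing Talend routine:

//hypothetical wrapper that reports which column and which raw value failed to parse
public static int parseIntWithContext(String raw, String columnName) {
    try {
        return Integer.parseInt(raw);
    } catch (NumberFormatException e) {
        throw new NumberFormatException("Column '" + columnName + "': cannot parse \"" + raw
                + "\" as Integer (check for leading/trailing whitespace)");
    }
}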
//tJavaRow workaround: copy every column from input_row to output_row,
//trimming String values along the way, using reflection so it works for any schema
java.lang.reflect.Field[] input_fields = input_row.getClass().getDeclaredFields();
//walk through the input columns
for (java.lang.reflect.Field field : input_fields) {
    //make the generated (private) fields readable so field.get() does not throw IllegalAccessException
    field.setAccessible(true);
    //skip the internal serialization helpers Talend adds to the row structs
    if (field.getName().equals("commonByteArrayLock") || field.getName().equals("commonByteArray")) {
        continue;
    }
    //read the current value of this input column
    Object col_value = field.get(input_row);
    //trim it only if it really is a String; other types are passed through untouched
    if (col_value instanceof String) {
        col_value = routines.StringHandling.TRIM((String) col_value);
    }
    //now populate the output schema: copy the value to the column with the same name
    java.lang.reflect.Field[] output_fields = output_row.getClass().getDeclaredFields();
    for (java.lang.reflect.Field output_field : output_fields) {
        output_field.setAccessible(true);
        if (field.getName().equals(output_field.getName())) {
            output_field.set(output_row, col_value);
        }
    }
}
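One thing worth noting about the reflection version: it matches input and output columns purely by name, so both schemas need identical column names, and a mismatch is silently skipped rather than reported. For a job with only a couple of affected columns, the explicit per-column version is simpler (the column names below are just placeholders, not from the original job):

//per-column equivalent of the same trim in a tJavaRow, with made-up column names
output_row.customer_id = input_row.customer_id == null ? null : input_row.customer_id.trim();
output_row.order_ref = input_row.order_ref == null ? null : input_row.order_ref.trim();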
Shong is right here... this would require changes to all the input components. A less invasive option might be to put the trim into the shared parse routine that the generated code already calls for numeric columns (parseTo_int in the ParserUtils routines, if I remember the class name correctly), so it only has to change in one place:
public static int parseTo_int(String s) {
    //trim surrounding whitespace before parsing so values like " 42 " are accepted
    if (s != null) {
        s = s.trim();
    }
    return Integer.parseInt(s);
}
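A quick sanity check for the trimming variant above (plain Java, nothing Talend-specific):

//values that currently cause the row to be rejected would parse cleanly
System.out.println(parseTo_int(" 42 "));  // prints 42 instead of throwing NumberFormatException
System.out.println(parseTo_int("7\t"));   // trim() removes tabs as well as spaces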