My use case is reading a pipe-delimited file via tFileInputDelimited that doesn't have the best hygiene. What I want to be able to do is strip whitespace, double quotes, tabs, newlines, etc. from every column.
tReplace is clearly capable of this, but the catch is this: I need to do this in at least two workflows with two different schemas of around 40 columns apiece, and potentially in the future with longer rows than that.
If there's no suitable component, does Talend's internal structure for a row have an iterator I could use to iterate over the row in a tJavaRow and write some freeform code to do the replace? I'm pretty comfortable programming this out; I'm just not too familiar with what Talend makes available to work with.
Thanks!
Hi,
As far as I know, a Talend row is a normal Java class.
The columns of the row are "hard coded" as variables (public fields) of that class, so
you can use Java reflection to iterate over them and encapsulate the functionality in a user routine:
import java.lang.reflect.Field; // goes at the top of the routine class

public static void cleanRow(Object row) throws Exception {
    // getFields() returns the public fields of the row class
    for (Field field : row.getClass().getFields()) {
        String fieldName = field.getName();
        System.out.println("processing field: " + fieldName);
        // only String columns are cleaned; other types are left untouched
        if (field.getType().equals(String.class)) {
            String value = (String) field.get(row);
            if (value != null) {
                // do your changes here
                value = value.trim(); // e.g. trim
                field.set(row, value);
            }
        }
    }
}
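For the cleanup described in the question (whitespace, double quotes, tabs, newlines), the "do your changes here" step could look roughly like the sketch below. It is only an illustration under assumptions: DemoRow is a hypothetical stand-in for a Talend-generated row class, RowCleaner and cleanValue are made-up names, and collapsing tabs and newlines into a single space (rather than deleting them) is a choice you may want to change.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for a generated row class: one public field per column.
class DemoRow {
    public String name;
    public String city;
    public Integer amount;
}

final class RowCleaner {
    private RowCleaner() {}

    // Removes double quotes, collapses any run of whitespace
    // (spaces, tabs, newlines) into one space, then trims both ends.
    static String cleanValue(String value) {
        if (value == null) {
            return null;
        }
        return value.replace("\"", "")
                    .replaceAll("\\s+", " ")
                    .trim();
    }

    // Applies cleanValue to every public String field of the given row object.
    static void cleanRow(Object row) {
        for (Field field : row.getClass().getFields()) {
            if (!field.getType().equals(String.class)) {
                continue; // non-String columns are left untouched
            }
            try {
                field.set(row, cleanValue((String) field.get(row)));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot access field " + field.getName(), e);
            }
        }
    }

    public static void main(String[] args) {
        DemoRow row = new DemoRow();
        row.name = "  \"Alice\"\t";
        row.city = "New\nYork ";
        row.amount = 5;
        cleanRow(row);
        System.out.println("[" + row.name + "] [" + row.city + "]"); // prints "[Alice] [New York]"
    }
}
```

In a tJavaRow, the likely place for the call is after the column assignments, e.g. cleanRow(output_row), assuming the method lives in a routine class that the job can see. Because it works on any object with public String fields, the same routine would cover both 40-column schemas without listing the columns.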
Thanks for the insight @Bernhard Gruber, I'll give this approach a try and see what I come up with!
Let me know if it works for you. It is also possible to set the Talend context variables with Java reflection.
I wrote my own context loader this way, because the built-in Talend functionality was not enough for me.
I also use the same technique to auto-propagate fields, i.e. to copy the input_row to the output_row of a tJavaRow (which does not provide this feature).
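A rough sketch of what such a field copy could look like is below. It is not the code from the post above, just one way to do it under assumptions: InRow and OutRow are hypothetical stand-ins for the generated row classes, RowCopier and copyMatchingFields are made-up names, and fields are matched by identical name and compatible type.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical stand-ins for the input and output row classes of a tJavaRow.
class InRow {
    public String id;
    public String name;
    public Integer qty;
}

class OutRow {
    public String id;
    public String name;
    public String extra; // no counterpart in InRow, so it is never overwritten
}

final class RowCopier {
    private RowCopier() {}

    // Copies every public field of source into the public field of target
    // that has the same name, provided the target type can hold the value.
    static void copyMatchingFields(Object source, Object target) {
        for (Field src : source.getClass().getFields()) {
            try {
                Field dst = target.getClass().getField(src.getName());
                int mods = dst.getModifiers();
                if (Modifier.isStatic(mods) || Modifier.isFinal(mods)) {
                    continue; // constants are not columns
                }
                if (dst.getType().isAssignableFrom(src.getType())) {
                    dst.set(target, src.get(source));
                }
            } catch (NoSuchFieldException e) {
                // target has no column with this name: skip it
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot copy field " + src.getName(), e);
            }
        }
    }

    public static void main(String[] args) {
        InRow in = new InRow();
        in.id = "7";
        in.name = "widget";
        in.qty = 3;
        OutRow out = new OutRow();
        out.extra = "keep";
        copyMatchingFields(in, out);
        System.out.println(out.id + " " + out.name + " " + out.extra); // prints "7 widget keep"
    }
}
```

Inside a tJavaRow, this would presumably be called as copyMatchingFields(input_row, output_row) before any column-specific assignments, so that explicit assignments still win over the copied values.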