Hi, I have another question from data validation. That is, the input file must contain one and only one row with a particular field value. Say, a input file must contain one and only one row of field data "A_Value" for column "A". I am thinking to use tAggregateRow to get all counts for column A and then check it in tJavaRow. but tJavaRow only processes one row. In this case, I need to get the list of all column counts to check if the list has the one and only row for that column A. Is there any way to do this ? Thanks ! James
Can you clarify? I am not sure if you are saying all rows must contain value "A_Value" in column A like this: A A_Value A_Value A_Value Or whether you are saying that there can only be one row with "A_value" like this A A_Value B_Value C_Value I am a bit slow today but still trying to help
Thanks for your help. It is the second case. The columnA could contain any data A_Value, B_Value,BB_value,CCC,.... but the validation requires it must contain the A_value and only allows one. James
You *could* use aggregate to find the count and verify the count = 1 for that value. You don't say what you want to have happen if the validation fails. Here is a job that will die if validation fails. This leverages the "run if" capability of talend as well as the exported variable from tFilterRow, NB_LINE_OK. Job looks like tFixedFlowInput -> tFilterRow -> tLogRow - run if trigger -> tDie tFixedFlowInput generates problem data like this: A "A_Value" "A_Value" "B_Value" "C_Value" tFilterRow filters where A Equals "A_Value" tLogRow just logs all the A_Values - this is not important for the job but useful for debugging after tLogRow create a RunIf trigger to a tDie component My RunIf condition looks like: (Integer)globalMap.get("tFilterRow_1_NB_LINE_OK") != 1 My tDie message looks like: "one and only one A_value required found " + globalMap.get("tFilterRow_1_NB_LINE_OK") When I run it I get one and only one A_value required found 2 .-------. |tLogRow_1| |=-----=| |A | |=-----=| |A_Value| |A_Value| '-------' Let me know what questions you have
Thanks for your reply.
After it fails, I need to log the file into a table (record rejected files). If it is valid, it will be continued to normal parsing process.
I am thinking of using tAggregate because there are multiple columns to be verified. Some columns could not exceeds some number, say 5 rows.
Yes, the tFilter is an option but this needs multiple filters. I wonder if I could use one tJavaRow to validate all of columns. If I could get the list of all counts, I could 1) check if the particular value exists and 2) check its counts if it exists by loop them all. This could check multiple columns in one loop.
The question is : is there a way to get a the list of tJava ? If not, is there way to validate multiple column by one process instead of multiple processes(filters) ?
James
Yes, you can use tJavaFlex to increment a number of variables based on your different test conditions for each row and then RunIf to test the results and determine what action to take e.g.
tJavaFlex Start code Integer count1 = 0;
Integer count2 = 0;
tJavaFlex Main code count1 += A_Value.equals(row1.A)?1:0;
count2 += B_Value.equals(row1.B)?1:0;
RunIf count1 == 1 && count2 <= 5
This is what I exactly wanted. Thanks a lot.
By the way, the "count1 += A_Value.equals(row1.A);" did not work. It give the error of "The operator += is undefined for the argument type(s) int, boolean". But I got what you meant, I just and "if" block.
Thanks again,
James