Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have a requirement to validate a file, if the file contains duplicate records, discard the file, if file does not contain duplicate process it. Even if we have one duplicate record discard the file.
Please help.
Thanks,
Pravin Sanadi
Hi,
I think the simple job is:
The "if" after tUniqRow is based on the value of the global variable tUniqRow_1_NB_DUPLICATES automagically associated to the tUniqRow_1 component (thank's to TDI):
((Integer)globalMap.get("tUniqRow_1_NB_DUPLICATES")) == 0
No tJavaRow required for this use case.
Hope this helps,
I think a combination of tUniqRow and tJavaRow should do the trick.
You can connect the duplicate output from tUniqRow to tJavaRow.
Also you would require a context which can act as a flag.
Finally in tJavaRow, Set the context value as true if the input has some duplicates.
E.g
and in tJavaRow,
Note that in the above screen, the context - duplicateExists is of boolean datatype.
Once this subjob completes, you can use the context in a if condition to decide whether you want to process the file or not.
Hi,
you can use tfilelist and add all the files in a directory and add the directory in the tfilelist, and then pass it through a tfileinputdelimited and then tuniquerow. give a runif condition on tfileinputdelimited as u can get the data's of unique rows and then pass it through tjava giving condition on the nof rows u have in a table, so if any duplicate is there in a table it will not show because we have rejected the record having duplicates using tuniquerow.So the table having less no
f rows reject it and the rest is your o/p.
Thanks, So to process the valid file I need to create another job?
u can do that in a single job...but u just need give an if condition in tfileinputdelimited
In tjavarow i have written this code
context.ISVALID=true;
if(input_row.MATERIAL!=null)
context.ISVALID=false;
output_row.ISVALID = context.ISVALID;
I need to process if it is valid and discard if not. how to do it?
what are the keys u give in tuniquerow, there are no records passing from tuniquerow!!
so the scenario is if you don't have duplicates process the file forward, if tuniq gives you duplicate report it.
from tJavaRow, you can use a runIf condition and join it with another subjob which will process the file.
Here in if (order 1) you can mention just context.ISVALID as your condition which means process the file since it contains no duplicates. In If (order 2) you can mention !context.ISVALID which means the file has duplicates.
It's a simple solution but the problem is you will have to read the file again. If your input file size isn't too large, then this solution should work just fine. Otherwise the job run time will increase since we will be reading a huge file twice.
If I can think of something better I will post it here.
Hi,
I think the simple job is:
The "if" after tUniqRow is based on the value of the global variable tUniqRow_1_NB_DUPLICATES automagically associated to the tUniqRow_1 component (thank's to TDI):
((Integer)globalMap.get("tUniqRow_1_NB_DUPLICATES")) == 0
No tJavaRow required for this use case.
Hope this helps,