Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Toronto Sept 9th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Validate file for duplicate records

Hi,

 

I have a requirement to validate a file, if the file contains duplicate records, discard the file, if file does not contain duplicate process it. Even if we have one duplicate record discard the file.

 

Please help.

 

Thanks,

Pravin Sanadi

Labels (1)
1 Solution

Accepted Solutions
TRF
Champion II
Champion II

Hi,

I think the simple job is:

0683p000009Lu5q.png

The "if" after tUniqRow is based on the value of the global variable tUniqRow_1_NB_DUPLICATES automagically associated to the tUniqRow_1 component (thank's to TDI):

((Integer)globalMap.get("tUniqRow_1_NB_DUPLICATES")) == 0

No tJavaRow required for this use case.

 

Hope this helps,

View solution in original post

10 Replies
Anonymous
Not applicable
Author

 

I think a combination of tUniqRow and tJavaRow should do the trick.

You can connect the duplicate output from tUniqRow to tJavaRow.

Also you would require a context which can act as a flag.

Finally in tJavaRow, Set the context value as true if the input has some duplicates.

E.g

0683p000009Ltwg.png

and in tJavaRow,

0683p000009Lu00.png

Note that in the above screen, the context - duplicateExists is of boolean datatype.

Once this subjob completes, you can use the context in a if condition to decide whether you want to process the file or not.

Anonymous
Not applicable
Author

Hi,

you can use tfilelist and add all the files in a directory and add the directory in the tfilelist, and then pass it through a tfileinputdelimited and then tuniquerow. give a runif condition on tfileinputdelimited as u can get the data's of unique rows and then pass it through tjava giving condition on the no0683p000009MA5A.pngf rows u have in a table, so if any duplicate is there in a table it will not show because we have rejected the record having duplicates using tuniquerow.So the table having less no0683p000009MA5A.pngf rows reject it and the rest is your o/p.

Anonymous
Not applicable
Author

Thanks, So to process the valid file I need to create another job?

Anonymous
Not applicable
Author

u can do that in a single job...but u just need give an if condition in tfileinputdelimited 

Anonymous
Not applicable
Author

0683p000009Lrgu.jpg

 

In tjavarow i have written this code

context.ISVALID=true;
if(input_row.MATERIAL!=null)
context.ISVALID=false;

output_row.ISVALID = context.ISVALID;

 

I need to process if it is valid and discard if not. how to do it?

Anonymous
Not applicable
Author

what are the keys u give in tuniquerow, there are no records passing from tuniquerow!!

Anonymous
Not applicable
Author

so the scenario is if you don't have duplicates process the file forward, if tuniq gives you duplicate report it.

Anonymous
Not applicable
Author

from tJavaRow, you can use a runIf condition and join it with another subjob which will process the file.

0683p000009Ltx0.png

Here in if (order 1) you can mention just context.ISVALID as your condition which means process the file since it contains no duplicates. In If (order 2) you can mention !context.ISVALID which means the file has duplicates.

It's a simple solution but the problem is you will have to read the file again. If your input file size isn't too large, then this solution should work just fine. Otherwise the job run time will increase since we will be reading a huge file twice.

If I can think of something better I will post it here.

TRF
Champion II
Champion II

Hi,

I think the simple job is:

0683p000009Lu5q.png

The "if" after tUniqRow is based on the value of the global variable tUniqRow_1_NB_DUPLICATES automagically associated to the tUniqRow_1 component (thank's to TDI):

((Integer)globalMap.get("tUniqRow_1_NB_DUPLICATES")) == 0

No tJavaRow required for this use case.

 

Hope this helps,