rohitpatil1993
Contributor II

List File names that are NOT present in HDFS Directory

Hello

I am trying to get the names of the files that are NOT present in a specified HDFS directory.

(Note: I have multiple files to check, and their names use a date mask.)

Here is what I thought would work, but it does not:

0695b00000G5UuXAAV.png

If condition --> !((Boolean)globalMap.get("tHDFSExist_1_EXISTS"))

tJava --> System.out.println(((String)globalMap.get("tHDFSExist_1_FILENAME")));

Can anyone help with a solution?

16 Replies
Anonymous
Not applicable

Can you tell me where the names of the files you need to check on the HDFS server come from? Can you explain your requirements in more detail?

manodwhb
Creator III

@Rohit Patil, your job design is the usual approach when you want to check whether the files from one location are present in another location. But here I do not understand why you are checking for the files right after listing them from the same directory. The check will always be true.

Thanks,

Manohar

rohitpatil1993
Contributor II
Author

Here is the requirement:

I have a total of 11 files that need to be checked for presence in the HDFS directory. If all of them are present, we go ahead with the other steps. If one or more files are missing, I have to send the names of the missing files via email.

The file names look like this:

"officeAddress_"+((String)globalMap.get("currentDate"))

"employeeAddress_"+((String)globalMap.get("currentDate"))

"xyz_"+((String)globalMap.get("previousDate"))

Note: currentDate and previousDate are global variables defined at the start of the job.

@Shicong Hong​ 
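For illustration, the naming scheme above could be sketched as a small standalone Java class. This is only a sketch under assumptions: a plain HashMap stands in for Talend's globalMap, the class name ExpectedFileNames and the date values are hypothetical, and only the three names quoted in the post are built (the remaining eight are not given).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: builds the expected file names from the two date variables.
class ExpectedFileNames {

    static List<String> build(Map<String, Object> globalMap) {
        String currentDate = (String) globalMap.get("currentDate");
        String previousDate = (String) globalMap.get("previousDate");

        List<String> names = new ArrayList<>();
        names.add("officeAddress_" + currentDate);
        names.add("employeeAddress_" + currentDate);
        names.add("xyz_" + previousDate);
        return names;
    }

    public static void main(String[] args) {
        // A plain map stands in for Talend's globalMap (assumption).
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("currentDate", "20240102");   // hypothetical date format
        globalMap.put("previousDate", "20240101");  // hypothetical date format

        for (String name : build(globalMap)) {
            System.out.println(name);
        }
    }
}
```

Having the names in one list makes it possible to feed them row by row into an existence check instead of wiring up one check per file.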

rohitpatil1993
Contributor II
Author

Please ignore the attached solution I added in the question body. The logic is wrong.

 


@Manohar B​ 

 

Anonymous
Not applicable

Assuming the 11 file names are read from somewhere at the start of the job, you need to iterate over the rows and pass the current file name to tHDFSExist, for example:

tFileInputDelimited--main--tFlowToIterate--Iterate-->tHDFSExist--runIf---tJava
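As a rough sketch of what the tJava step in this flow could do, the code below records each missing file name in a list so the names are still available after the loop ends. It is a standalone simulation, not the actual job: a plain HashMap stands in for globalMap, and the key "missingFiles" and the class name are hypothetical. The keys tHDFSExist_1_EXISTS and tHDFSExist_1_FILENAME are the ones used in the original question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MissingFileCollector {

    // Hypothetical globalMap key under which the missing names accumulate.
    static final String MISSING_KEY = "missingFiles";

    // Logic for the tJava step: runs once per iteration, after tHDFSExist.
    @SuppressWarnings("unchecked")
    static void recordIfMissing(Map<String, Object> globalMap) {
        // Boolean.TRUE.equals(...) treats an unset flag as "not found" and avoids an NPE.
        boolean exists = Boolean.TRUE.equals(globalMap.get("tHDFSExist_1_EXISTS"));
        if (exists) {
            return;
        }
        List<String> missing = (List<String>) globalMap.get(MISSING_KEY);
        if (missing == null) {
            missing = new ArrayList<>();
            globalMap.put(MISSING_KEY, missing);
        }
        missing.add((String) globalMap.get("tHDFSExist_1_FILENAME"));
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();

        // Simulated iteration 1: the file exists.
        globalMap.put("tHDFSExist_1_FILENAME", "officeAddress_20240102");
        globalMap.put("tHDFSExist_1_EXISTS", Boolean.TRUE);
        recordIfMissing(globalMap);

        // Simulated iteration 2: the file is missing.
        globalMap.put("tHDFSExist_1_FILENAME", "xyz_20240101");
        globalMap.put("tHDFSExist_1_EXISTS", Boolean.FALSE);
        recordIfMissing(globalMap);

        System.out.println(globalMap.get(MISSING_KEY));
    }
}
```

Because the method checks the flag itself, the same logic would work whether the step is reached through a runIf link or runs on every iteration.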

 

 

rohitpatil1993
Contributor II
Author

@Shicong Hong I have developed half of it: the part that checks whether all 11 files are present. The next part, finding the names of the missing files, is the challenge.

Can you elaborate more on

runIf---tJava ?

 

Anonymous
Not applicable

Yes, as you did in the first post:

....runIf---tJava

In tJava, print the file name to the console for debugging.

If you want to output the file name somewhere, use tFixedFlowInput to generate the file name, e.g.:

runIf--tFixedFlowInput--main--tFileOutputDelimited
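If the missing names are gathered in a list rather than written to a file, the email text asked for in the requirement could be assembled along these lines. This is a minimal sketch: the class name MissingFileReport and the message wording are hypothetical, and how the text reaches a mail component (for example tSendMail) is left out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MissingFileReport {

    // Returns an empty string when nothing is missing, so the caller can
    // decide whether to continue the job or send a notification.
    static String buildBody(List<String> missing) {
        if (missing == null || missing.isEmpty()) {
            return "";
        }
        return "Missing files in HDFS directory (" + missing.size() + "):\n"
                + String.join("\n", missing);
    }

    public static void main(String[] args) {
        // Hypothetical file names, for demonstration only.
        List<String> missing = Arrays.asList("officeAddress_20240102", "xyz_20240101");
        System.out.println(buildBody(missing));

        // No missing files: the body is empty and the job can go ahead.
        System.out.println(buildBody(Collections.<String>emptyList()).isEmpty());
    }
}
```

An empty result corresponds to the "all files present" branch of the requirement, and a non-empty result to the "send the names via email" branch.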