Hello
I am trying to get the names of the files that are NOT present in a specified HDFS directory.
(Note: I have multiple files whose names use a date mask for this check.)
Here is what I thought would work, but it does not:
If condition --> !((Boolean)globalMap.get("tHDFSExist_1_EXISTS"))
tJava --> System.out.println(((String)globalMap.get("tHDFSExist_1_FILENAME")));
Can anyone help with a solution?
Can you let me know where the names of the files that you need to check on the HDFS server come from? Can you explain your requirements in more detail?
@Rohit Patil, usually this kind of scenario is about checking whether files from one location are present in another location, and in that case your job design would work. But here I do not understand why you are checking for the files after listing them in the same directory; the check will always be true.
Thanks,
Manohar
Here is the requirement:
I have a total of 11 files that need to be checked for presence in an HDFS directory. If all of them are present, we go ahead with the other steps; if one or more files are missing, I have to send the names of the missing files via email.
The file names look like this:
"officeAddress_"+((String)globalMap.get("currentDate"))
"employeeAddress_"+((String)globalMap.get("currentDate"))
"xyz_"+((String)globalMap.get("previousDate"))
Note: currentDate and previousDate are global variables defined at the start of the job.
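To make the naming scheme concrete, here is a minimal plain-Java sketch of how the expected names could be assembled before the check. It only covers the three patterns shown above; the class name ExpectedFiles, the method build, and the date values are hypothetical, and globalMap is simulated with an ordinary HashMap rather than the one Talend provides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ExpectedFiles {

    // Builds the expected file names from the two date values held in globalMap.
    // Only the three patterns quoted in the post are included here.
    static List<String> build(Map<String, Object> globalMap) {
        String currentDate = (String) globalMap.get("currentDate");
        String previousDate = (String) globalMap.get("previousDate");

        List<String> names = new ArrayList<>();
        names.add("officeAddress_" + currentDate);
        names.add("employeeAddress_" + currentDate);
        names.add("xyz_" + previousDate);
        return names;
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap; the date strings are placeholders,
        // since the actual date mask is not given in the post.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("currentDate", "20240102");
        globalMap.put("previousDate", "20240101");

        for (String name : build(globalMap)) {
            System.out.println(name);
        }
    }
}
```

A list like this could then feed whatever component drives the per-file check.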
@Shicong Hong
Please ignore the attached solution I added in the question body. The logic is wrong:
@Manohar B
Assuming the 11 file names are read from somewhere at the start of the job, you need to iterate over the rows and pass the current file name to tHDFSExist, for example:
tFileInputDelimited--main--tFlowToIterate--Iterate-->tHDFSExist--runIf---tJava
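The control flow of that chain can be sketched in plain Java as below. This is only a simulation under stated assumptions: the HDFS lookup is replaced by a Set of names that are "present", globalMap is an ordinary HashMap, and the class and method names (IterateCheck, runIfFires, namesReachingTJava) are hypothetical. The two globalMap keys are the ones used in the first post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class IterateCheck {

    // Same expression as the runIf condition in the first post:
    // true when the file checked in the current iteration does not exist.
    static boolean runIfFires(Map<String, Object> globalMap) {
        return !((Boolean) globalMap.get("tHDFSExist_1_EXISTS"));
    }

    // One loop pass stands for one iteration of the job.
    static List<String> namesReachingTJava(List<String> expected, Set<String> presentOnHdfs) {
        Map<String, Object> globalMap = new HashMap<>();
        List<String> reached = new ArrayList<>();
        for (String name : expected) {
            // Stand-in for tHDFSExist: record the name and the check result.
            globalMap.put("tHDFSExist_1_FILENAME", name);
            globalMap.put("tHDFSExist_1_EXISTS", presentOnHdfs.contains(name));

            // Stand-in for runIf---tJava: only missing files get this far.
            if (runIfFires(globalMap)) {
                reached.add((String) globalMap.get("tHDFSExist_1_FILENAME"));
            }
        }
        return reached;
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("officeAddress_D1", "employeeAddress_D1", "xyz_D0");
        Set<String> present = new HashSet<>(Arrays.asList("employeeAddress_D1"));
        System.out.println(namesReachingTJava(expected, present));
    }
}
```

The point of the sketch is that the existence check runs once per file name, so the file name seen after the runIf is always the one that was just found missing.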
@Shicong Hong I have developed half of it: the part that checks whether all 11 files are present. Finding the names of the missing files is the challenge.
Can you elaborate more on the
runIf---tJava part?
Yes, as you did in the first post,
....runIf---tJava
On tJava: print the file name to the console for debugging.
If you want to output the file name somewhere, use tFixedFlowInput to generate the file name, e.g.:
runIf--tFixedFlowInput--main--tFileOutputDelimited.
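Since the end goal is an email listing the missing files, another option (not the file-based approach described above, just a sketch of an alternative) is to let the tJava after the runIf collect the names in a list and join them once the iteration is finished. In the code below, the key "missingFiles" and the names MissingFileReport, recordMissing and emailBody are hypothetical, and globalMap is again simulated with a HashMap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MissingFileReport {

    // Hypothetical globalMap key used to accumulate the missing names.
    static final String KEY = "missingFiles";

    // Roughly what the tJava behind the runIf could do on each iteration:
    // append the name of the file that was just found missing.
    @SuppressWarnings("unchecked")
    static void recordMissing(Map<String, Object> globalMap) {
        List<String> missing = (List<String>) globalMap.get(KEY);
        if (missing == null) {
            missing = new ArrayList<>();
            globalMap.put(KEY, missing);
        }
        missing.add((String) globalMap.get("tHDFSExist_1_FILENAME"));
    }

    // Builds the text for the email; an empty string means nothing is missing.
    @SuppressWarnings("unchecked")
    static String emailBody(Map<String, Object> globalMap) {
        List<String> missing = (List<String>) globalMap.get(KEY);
        if (missing == null || missing.isEmpty()) {
            return "";
        }
        return "Missing files:\n" + String.join("\n", missing);
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tHDFSExist_1_FILENAME", "officeAddress_D1");
        recordMissing(globalMap);
        globalMap.put("tHDFSExist_1_FILENAME", "xyz_D0");
        recordMissing(globalMap);
        System.out.println(emailBody(globalMap));
    }
}
```

An empty result can double as the "all files present" signal for deciding whether to continue with the other steps or send the mail.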