Hi,
Is it possible to get all files in an HDFS directory and save them as a single file / multiple files on a local machine?
I want to extract files from a directory using regular expressions, but it doesn't seem to work. The documentation here (https://help.talend.com/reader/g8zdjVE7fWNUh3u4ztO6Dw/PUKLf_wAqRMmwe4w~Lw1wA) says regular expressions are supported in filemasks.
I'm basically trying to grab files that match: ".+part-.*" inside a directory (iterating through subdirectories).
These files are the output of tFileOutputDelimited from a Spark Streaming job.
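For reference, the filemask can be sanity-checked against a few sample paths with plain Java regex matching (the file paths below are made up, just to illustrate what the pattern should and shouldn't match):

```java
import java.util.regex.Pattern;

public class FilemaskCheck {
    public static void main(String[] args) {
        // The regex filemask from the question
        Pattern p = Pattern.compile(".+part-.*");

        // Hypothetical file paths, only for checking the pattern
        String[] paths = {
            "/output/2023/part-00000", // typical Spark part file: should match
            "/output/2023/part-00001", // should match
            "/output/2023/_SUCCESS"    // marker file: should not match
        };

        for (String path : paths) {
            System.out.println(path + " -> " + p.matcher(path).matches());
        }
    }
}
```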
Thank you.
Have you tried tHDFSList? You can specify a filemask (glob or regex) with it and iterate through the files / directories / subdirectories of a specific HDFS location. You could then pass the global variable
((String)globalMap.get("tHDFSList_1_CURRENT_FILEPATH"))
to the "HDFS directory" property of tHDFSGet
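In plain Java terms, that expression is just a cast out of a Map<String, Object>. A minimal sketch of what happens (the map and the file path are simulated here, since globalMap is populated by the Talend job at runtime):

```java
import java.util.HashMap;
import java.util.Map;

public class GlobalMapSketch {
    public static void main(String[] args) {
        // Simulated stand-in for Talend's globalMap
        Map<String, Object> globalMap = new HashMap<>();

        // tHDFSList stores the path of the file it is currently iterating
        // over under "<component name>_CURRENT_FILEPATH"; the value here
        // is a hypothetical example
        globalMap.put("tHDFSList_1_CURRENT_FILEPATH", "/user/demo/output/part-00000");

        // The expression you paste into the "HDFS directory" property of tHDFSGet
        String currentPath = ((String) globalMap.get("tHDFSList_1_CURRENT_FILEPATH"));

        System.out.println(currentPath);
    }
}
```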
Thank you! I also stumbled across an example at the bottom of the documentation. I didn't know that autocomplete works in the component fields too.