Anonymous
Not applicable

How do I create a directory in HDFS with Talend?

Noob question, so please don't laugh too hard.... :-)
I am creating a date/time based directory structure in HDFS via file insertion from Talend:
YYYY/MM/DD/HH/
So, when I insert files into HDFS, the necessary file structure is automatically created:
inserting file "YYYY/MM/DD/11/data.txt" creates directory "YYYY/MM/DD/11/"
EXCEPT when I happen to insert the file exactly at the start of the hour (i.e. 11:00 on the dot).
When I create the file at the start of the hour, Talend writes a file named "YYYY/MM/DD/11" (where "11" is the file "data.txt", renamed for some reason) instead of "YYYY/MM/DD/11/data.txt". All my subsequent inserts for that hour then fail, because they try to write into a directory called "YYYY/MM/DD/11/" and that path is already taken by a file.
The workaround I found was to manually create the directory structure before Talend attempts to insert files.
How do I create a directory in HDFS with Talend so I can remove the manual step?
2 Replies
Anonymous
Not applicable
Author

>> The work around I found was to manually create the directory structure before Talend attempts to insert files.
For now, I have added the directory-creation call (hadoop fs -mkdir ...) as a system call component in Talend.
I would like to know if there is a more elegant solution.
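For reference, the system call described above can be sketched in Java, the language Talend jobs are generated in. This is only an illustration under assumptions: the class and method names are hypothetical, the example path is made up, and the `-p` flag (create missing parent directories, no error if the directory already exists) is assumed to be supported by the installed Hadoop version. The sketch only builds and prints the command. The line that would actually launch it is left as a comment.

```java
import java.util.List;

// Hypothetical helper: builds the argument list for "hadoop fs -mkdir -p <dir>".
class HdfsMkdirCommand {

    // Returns the command as separate arguments, which avoids shell quoting issues.
    static List<String> mkdirCommand(String hdfsDir) {
        return List.of("hadoop", "fs", "-mkdir", "-p", hdfsDir);
    }

    public static void main(String[] args) throws Exception {
        // Example path only; a real job would pass in its own hourly directory.
        String dir = args.length > 0 ? args[0] : "/tmp/example/2014/03/05/11";
        List<String> cmd = mkdirCommand(dir);
        System.out.println(String.join(" ", cmd));
        // To run it for real (requires the hadoop CLI on the PATH):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Creating the directory before the first insert of each hour means the hourly path already exists as a directory when the file is written.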
Anonymous
Not applicable
Author

In the component properties, under HDFS Directory, can't you just set up a context variable called context.datedDirectory? Each time the job runs, it would get a different date, either from a Talend.GetDate function or from parsing the files you are putting into it, and Talend would create the directory in HDFS based on this context variable's value for that run.
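A minimal sketch of how such a dated directory value could be computed, assuming plain Java (`java.time`) rather than any particular Talend routine. The class name, method name, and base directory are hypothetical. The value always ends with a trailing slash, so it is unambiguous that the last segment (the hour) is meant as a directory and not a file name.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: builds a "YYYY/MM/DD/HH/" style directory for a given time.
class DatedDirectory {

    // Year/month/day/hour (24h), zero-padded, separated by slashes.
    private static final DateTimeFormatter HOURLY =
            DateTimeFormatter.ofPattern("yyyy/MM/dd/HH");

    // Joins the base directory and the hourly path, always ending with "/".
    static String hourlyDirectory(String baseDir, LocalDateTime when) {
        String base = baseDir.endsWith("/") ? baseDir : baseDir + "/";
        return base + when.format(HOURLY) + "/";
    }

    public static void main(String[] args) {
        // "/data/in" is an example base directory, not taken from the thread.
        System.out.println(hourlyDirectory("/data/in", LocalDateTime.now()));
    }
}
```

The result could then be assigned to context.datedDirectory at the start of the job and used as the HDFS directory, with the file name supplied separately.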