Anonymous
Not applicable

[resolved] Hive : Create external table based on a CSV file

Hello, I have searched the forum and the documentation, but I didn't see where to specify the location of the CSV file on HDFS.
I have a .csv file located at /user/mapr/extrnal_tables/my_file.csv.
Hive creates the table with the correct format since I use a schema; the column names are right, etc.
BUT the data is not there when I specify only the directory in the URI (/user/mapr/extrnal_tables), and when I specify the complete path of the file (/user/mapr/extrnal_tables/my_file.csv) I get this error:
Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:maprfs:/user/mapr/external_tables/my_file.csv is not a directory or unable to create one)
The job has 3 steps:
1) FileList -> OK
2) Put to HDFS files from previous job at specified location -> OK
3) Create Hive table -> NOK
EDIT: it's working now.
I just added a prefix to the file name and it magically worked; I don't know why.
1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

You don't specify the filename in the Hive CREATE TABLE statement. Hive only works at the directory level, so that multiple reducers can write data into HDFS in parallel. If you specify a filename, the data would have to go through a single reducer, resulting in poor performance.
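
The point above can be sketched in HiveQL. This is a minimal example; the table name, column definitions, and delimiter are illustrative assumptions (they are not given in the thread) — the key detail is that LOCATION points at the directory, not at the file:

```sql
-- Hypothetical schema: adjust the columns and delimiter to match your CSV.
CREATE EXTERNAL TABLE my_table (
  id INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
-- Directory-level path only; Hive reads every file inside it.
LOCATION '/user/mapr/extrnal_tables';
```

Any file placed in that directory (via the tPut/HDFS step of the job, for example) becomes visible to the table without further DDL.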


1 Reply