
Anonymous
Not applicable
2016-07-11
06:26 AM
[resolved] Hive : Create external table based on a CSV file
Hello, I have searched the forum and the documentation, but I couldn't find where to specify the location of the CSV file on HDFS.
I have a .csv file located at /user/mapr/external_tables/my_file.csv
Hive creates the table with the correct format since I use a schema; the column names are right, etc.
BUT the data is not there when I specify only the directory in the URI (/user/mapr/external_tables), and when I specify the complete path of the file (/user/mapr/external_tables/my_file.csv) I get this error:
Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:maprfs:/user/mapr/external_tables/my_file.csv is not a directory or unable to create one)
The job has 3 steps:
1) FileList -> OK
2) Put the files from the previous step to HDFS at the specified location -> OK
3) Create Hive table -> NOK
EDIT: it's OK now.
I only put a prefix on the file name and it magically worked. I don't know why.
343 Views
1 Solution
Accepted Solutions

Anonymous
Not applicable
2016-07-11
11:05 AM
Author
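You don't specify the filename in the Hive CREATE TABLE statement. Hive only works at the directory level, so that multiple reducers can quickly write data into HDFS: the table location must be a directory, which is why pointing it at my_file.csv fails with "is not a directory or unable to create one". If you could specify a filename, Hive would have to send the file to a single reducer, which would result in bad performance.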
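For illustration, here is a minimal sketch of what such a statement could look like. The table name and the two columns are made up, since the real schema isn't shown in the question, and the directory is spelled as it appears in the error message. It also assumes a plain comma-delimited text file with no quoted fields.

```sql
-- Hypothetical table and column names; replace them with your own schema.
-- LOCATION points at the directory that contains my_file.csv, not at the file itself.
CREATE EXTERNAL TABLE my_table (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/mapr/external_tables';
```

Hive treats the files it finds in that directory as the table's data, so it is usually best to keep only files with the same layout there.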
