Anonymous
Not applicable

How to push data from HDFS to Hive Table

Hi All,

I am not able to push data from HDFS to Hive.
Objective:
Create tables and load data into Hive.

Platform:
CDH 4.4
TOS for Big Data 5.4.0
Ubuntu OS.

Architecture:
Both CDH and TOS are on the same machine (CDH single-host implementation).
Components being used:
tHiveConnection
tHiveCreateTable
tHiveLoad
More info:
Screenshot attached.

6 Replies
Anonymous
Not applicable
Author

Error Log:
Starting job Test at 06:53 14/11/2013.

connecting to socket on port 4047
connected
: org.apache.hadoop.hive.conf.HiveConf - hive-site.xml not found on CLASSPATH
: org.apache.hadoop.conf.Configuration - mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
: org.apache.hadoop.conf.Configuration - mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
: org.apache.hadoop.conf.Configuration - mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
: org.apache.hadoop.conf.Configuration - mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
: org.apache.hadoop.conf.Configuration - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
: org.apache.hadoop.conf.Configuration - mapred.reduc

Anonymous
Not applicable
Author

You just have a permission issue within HDFS.
Please change the permissions using hadoop fs -chmod and hadoop fs -chown.
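A minimal sketch, assuming the job connects to Hive as the hive user and the data sits under a hypothetical /user/talend/input directory (adjust both to your cluster):

# Hypothetical paths and users -- adjust to your cluster.
# Make the directory and its files readable/traversable by the Hive service:
hadoop fs -chmod -R 755 /user/talend/input
# Or hand ownership to the user that runs the Hive job:
hadoop fs -chown -R hive:hive /user/talend/input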
Anonymous
Not applicable
Author

Hi Remy,
Can you please suggest the components I should use to write data directly from HDFS into a Hive table? I tried tHiveLoad, but it loads data from a local file rather than from an HDFS file. Can you please guide me? Currently I am writing the data to a file on the Linux box and then loading that file into the Hive tables using tHiveLoad.
Anonymous
Not applicable
Author

Hi,
If your data is already on HDFS, then I think you can use tHDFSCopy to move the data from its current location to the HDFS location of your Hive table.
Does that make sense?
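To illustrate what that copy amounts to (a sketch only, using a hypothetical source path and the default Hive warehouse directory; check hive.metastore.warehouse.dir for the real location of your table):

# Hypothetical paths -- a managed Hive table keeps its data under the warehouse directory.
hadoop fs -cp /user/talend/input/data.csv /user/hive/warehouse/my_table/
# Or move the file instead of copying it:
hadoop fs -mv /user/talend/input/data.csv /user/hive/warehouse/my_table/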
Anonymous
Not applicable
Author

Hi Remy,
That makes sense. Thanks a lot. Can you do me another favor? Can you suggest a BI engine that would fit well with Talend Open Studio for Big Data? I am currently trying SpagoBI and find it pretty confusing and complex. Thanks.
Anonymous
Not applicable
Author

Hi,
I think the problem is in your tHiveLoad component: the "Local" option has been checked. Uncheck it and give an HDFS file path, and it will load the data into your Hive table directly.
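For reference, this is just Hive's distinction between LOAD DATA and LOAD DATA LOCAL; with a hypothetical table and path, the statements behind that option look roughly like this (run here through the hive CLI):

# Without LOCAL, the input path is taken from HDFS (what you want here):
hive -e "LOAD DATA INPATH '/user/talend/input/data.csv' INTO TABLE my_table;"
# With LOCAL (the checked 'Local' option), the path is read from the local filesystem instead:
# hive -e "LOAD DATA LOCAL INPATH '/home/talend/data.csv' INTO TABLE my_table;"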