Anonymous
Not applicable

[resolved] tImpalaInput - java.lang.ClassNotFoundException: org.apache.hadoop.hiv

I'm using CDH 5.2.0 with Impala 2.0.0+cdh5.2.0+0 and Hive 0.13.1+cdh5.2.0+221. I can run this query successfully on the Impala cluster through Hue, but not through Talend Open Studio for Big Data 5.6.0.20141024_1545. I am using the tImpalaInput component to run the query, and my cluster has Kerberos enabled:
Query:  select code, sum(salary) as salarysum from sample_07 group by code order by code;
Error from Talend:
Starting job TOS_ImpalaTesting at 09:53 18/12/2014.
connecting to socket on port 3993
connected
: org.apache.hadoop.util.Shell - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:324)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:339)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:332)
at org.apache.hadoop.hive.conf.HiveConf$ConfVars.findHadoopBinary(HiveConf.java:918)
at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:228)
at org.apache.hive.jdbc.HiveConnection.isHttpTransportMode(HiveConnection.java:304)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:181)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:164)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:215)
at hadooptesting.tos_impalatesting_0_1.TOS_ImpalaTesting.tImpalaConnection_1Process(TOS_ImpalaTesting.java:354)
at hadooptesting.tos_impalatesting_0_1.TOS_ImpalaTesting.runJobInTOS(TOS_ImpalaTesting.java:1047)
at hadooptesting.tos_impalatesting_0_1.TOS_ImpalaTesting.main(TOS_ImpalaTesting.java:904)
disconnected
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/shims/ShimLoader
at org.apache.hive.service.auth.KerberosSaslHelper.getKerberosTransport(KerberosSaslHelper.java:68)
at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:250)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:181)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:164)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:215)
at hadooptesting.tos_impalatesting_0_1.TOS_ImpalaTesting.tImpalaConnection_1Process(TOS_ImpalaTesting.java:354)
at hadooptesting.tos_impalatesting_0_1.TOS_ImpalaTesting.runJobInTOS(TOS_ImpalaTesting.java:1047)
at hadooptesting.tos_impalatesting_0_1.TOS_ImpalaTesting.main(TOS_ImpalaTesting.java:904)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.shims.ShimLoader
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 10 more
Job TOS_ImpalaTesting ended at 09:53 18/12/2014.

I did find a Hive JIRA saying that this (or a similar issue) is fixed in Hive 0.14, which was recently released.
Any help would be appreciated.  Screenshots of my components and process attached below.  Thank you.
0683p000009MBnX.png 0683p000009MC4C.png 0683p000009MC4H.png
0683p000009MBgw.png

Accepted Solutions
Anonymous
Not applicable
Author

The first error is not really an error. The missing winutils.exe warning appears routinely when running Hadoop client code on Windows, and it is an upstream Hadoop issue. The second error occurs because you are using CDH 5.2 (Impala 2.0), which the Talend components do not currently support. Hadoop, Cloudera, and Hortonworks are all very picky about the libraries and versions being used: they need to be correct and match the cluster versions. To connect to Impala 2.0 on CDH 5.2 you need hive-jdbc-0.13.0.jar or the Cloudera driver, neither of which is included in the Talend 5.6 components. (The component also does not appear to include the hive-exec dependency, which is a bug in the component, but fixing that alone wouldn't save you.) You can either use a supported version of CDH (5.1) or update the components yourself to include the correct libraries (hive-jdbc-0.13.x.jar and hive-exec-0.13.x.jar). Welcome to the Hadoop arms race.
http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/latest/topics/impala_jdbc....
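If you go the route of adding the libraries yourself, one quick way to confirm that the job's classpath has what it needs is to probe for the classes named in the stack trace before opening the connection. The following is only a minimal sketch, not part of the Talend components: the class name HiveClasspathCheck and its helper methods are hypothetical, and the jar names in the comments are my assumption about where these classes usually live, so verify them against your distribution.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Talend or Hive API): reports which of the
// classes seen in the stack trace cannot be found on the current classpath.
public class HiveClasspathCheck {

    // Class names taken from the stack trace in the question. The jar named in
    // each comment is an assumption; check your own distribution.
    public static final String[] REQUIRED_CLASSES = {
        "org.apache.hive.jdbc.HiveDriver",                 // usually hive-jdbc
        "org.apache.hive.service.auth.KerberosSaslHelper", // usually hive-service
        "org.apache.hadoop.hive.conf.HiveConf",            // usually hive-common
        "org.apache.hadoop.hive.shims.ShimLoader"          // usually hive-shims (also bundled in hive-exec)
    };

    // Returns true if the class can be located. initialize=false means static
    // initializers are not run, so this only tests for presence.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, HiveClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the subset of classNames that could not be located.
    public static List<String> missingClasses(String[] classNames) {
        List<String> missing = new ArrayList<>();
        for (String name : classNames) {
            if (!isOnClasspath(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missingClasses(REQUIRED_CLASSES);
        if (missing.isEmpty()) {
            System.out.println("All probed Hive classes were found.");
        } else {
            for (String name : missing) {
                System.out.println("Missing from classpath: " + name);
            }
        }
    }
}
```

Run it with the same classpath the job uses. Any class it reports as missing points to a jar that still has to be added. A clean result only shows the classes are present, not that their versions match the cluster.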


2 Replies
Anonymous
Not applicable
Author

jholman - Thank you for the input.  I thought that might be the case based on the error and the Hive JIRA I found. I also replicated the same functionality with the same setup using tHive components and did not run into any issues.
I appreciate your help!