How to connect QlikView to Hadoop Hive

    This example shows how to connect QlikView to Hadoop Hive using the JDBC Connector:

     

    http://community.qlik.com/docs/DOC-2438

     

    First, configure the JDBC Connector as described in that document.

     

    Then download a Cloudera demo VM: https://ccp.cloudera.com/display/SUPPORT/Downloads

     

    After starting the CentOS VM with Hadoop, create the Beeswax for Hive examples through the Hue web app. They consist of two tables:

    Cloudera01.png

    Then start the Hive service, which listens on port 10000 by default:

     

    /usr/bin/hive --service hiveserver

     

    Don't forget to look up the IP address of your VM (run ifconfig).
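
    Before moving to the client side, it can help to confirm that the Hive port is reachable from the client machine. The following is a minimal sketch, not part of the original setup; the function name is_port_open is a hypothetical helper, and it only assumes that the service listens on TCP port 10000 as stated above.

```python
import socket


def is_port_open(host: str, port: int = 10000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False
```

    Calling is_port_open("192.168.113.139") with the IP address of your own VM should return True once the Hive service is running. A True result only shows that something accepts connections on that port, not that Hive itself answers queries.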

     

    The next steps are on the client side. Extract the attached file hive_jdbc-0.7.1.zip to a folder.

     

    This archive is a collection of libraries we put together for this purpose. We also had to add a file META-INF/services/java.sql.Driver, containing the driver class name org.apache.hadoop.hive.jdbc.HiveDriver, to the library hive-jdbc-0.7.1.jar so that Java's JDBC driver discovery can find the driver.
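
    The attached jar already contains this entry, so nothing needs to be done here. If you start from an unmodified hive-jdbc jar instead, the entry could be added along the following lines. This is only a sketch of one possible way, not necessarily how the attached jar was built; the function name add_driver_service_entry is a hypothetical helper. Work on a copy of the jar.

```python
import zipfile

SERVICE_ENTRY = "META-INF/services/java.sql.Driver"
HIVE_DRIVER = "org.apache.hadoop.hive.jdbc.HiveDriver"


def add_driver_service_entry(jar_path: str, driver_class: str = HIVE_DRIVER) -> bool:
    """Add the JDBC service entry to the jar; return False if it already exists."""
    # A jar is a zip archive, so append mode can add a single new entry
    with zipfile.ZipFile(jar_path, mode="a") as jar:
        if SERVICE_ENTRY in jar.namelist():
            return False
        # The file holds one fully qualified driver class name per line
        jar.writestr(SERVICE_ENTRY, driver_class + "\n")
        return True
```

    For example, add_driver_service_entry("hive-jdbc-0.7.1.jar") returns True when the entry was written and False when the jar already had one.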

     

    Now add each of the following Java libraries, with its full path, to the CLASSPATH environment variable:

     

    hive-jdbc-0.7.1.jar

    hive-exec-0.7.1.jar

    hive-metastore-0.7.1.jar

    hive-service-0.7.1.jar

    hadoop-0.20.0-core.jar

    commons-logging-1.0.4.jar

    log4j-1.2.16.jar

    libfb303.jar

    slf4j-api-1.6.1.jar

    slf4j-log4j12-1.6.1.jar
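
    Typing ten full paths by hand is error-prone, so the CLASSPATH value can also be generated. The sketch below assumes a Windows client (QlikView runs on Windows), which is why it uses backslash paths and ";" as the separator. The folder C:\hive_jdbc and the function name build_classpath are hypothetical.

```python
import ntpath

# Jar names exactly as listed in this post
HIVE_JARS = [
    "hive-jdbc-0.7.1.jar",
    "hive-exec-0.7.1.jar",
    "hive-metastore-0.7.1.jar",
    "hive-service-0.7.1.jar",
    "hadoop-0.20.0-core.jar",
    "commons-logging-1.0.4.jar",
    "log4j-1.2.16.jar",
    "libfb303.jar",
    "slf4j-api-1.6.1.jar",
    "slf4j-log4j12-1.6.1.jar",
]


def build_classpath(folder: str, jar_names=HIVE_JARS, separator: str = ";") -> str:
    """Join the full Windows path of every jar into one CLASSPATH value."""
    # ntpath gives Windows-style paths regardless of where this script runs
    return separator.join(ntpath.join(folder, name) for name in jar_names)
```

    print(build_classpath(r"C:\hive_jdbc")) then prints a value that can be pasted into the CLASSPATH variable. If CLASSPATH already has entries, append the new value to them rather than replacing them.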

     

    Now connect to the Hive instance in QlikView and select your table:

     

    CUSTOM CONNECT TO "Provider=JDBCConnector_x64.dll;jdbc:hive://192.168.113.139:10000/default;XUserId=KVPKRRRNPLdIWSJOBDTA;XPassword=EdZQQRRNPLdIWSJOBTYA;";
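
    The JDBC URL inside this connect string has the form jdbc:hive://host:port/database, so only the IP address needs to change for your own VM. As a small illustration, the sketch below builds and parses such a URL; both function names are hypothetical helpers. The XUserId and XPassword values are generated by QlikView and are not covered here.

```python
import re

# Matches URLs of the form jdbc:hive://host:port/database
_HIVE_URL = re.compile(r"^jdbc:hive://(?P<host>[^:/]+):(?P<port>\d+)/(?P<database>\w+)$")


def hive_jdbc_url(host: str, port: int = 10000, database: str = "default") -> str:
    """Build a Hive JDBC URL from its three parts."""
    return f"jdbc:hive://{host}:{port}/{database}"


def parse_hive_jdbc_url(url: str):
    """Split a Hive JDBC URL into (host, port, database)."""
    match = _HIVE_URL.match(url)
    if match is None:
        raise ValueError(f"not a jdbc:hive URL: {url!r}")
    return match.group("host"), int(match.group("port")), match.group("database")
```

    With the values from this post, hive_jdbc_url("192.168.113.139") yields the URL used in the connect string above.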

     

    Cloudera02.png

    See the result in the attached QVW file.

     

    Don't hesitate to ask if you have trouble with this tutorial. Any notes or comments are welcome.

     

    - Ralf

     

    Update: I have collected all required jar files for Hive 0.8.1, which ships with the new Cloudera distribution CDH4. You can use this setup:

    JDBCConfig.png

    Update 2: I have collected all required jar files for Hive 0.9 (see attachments). There is still a UTF-8 issue in Hive 0.9. Ask me for a fix.

     

    Update 3: We are working hard to figure out whether and how we can use JDBC with Cloudera Impala. Maybe something will come up soon.

     

    Update 4: We now have a first version of a Cloudera Impala JDBC driver (3 months before Cloudera itself will release one), which uses the Beeswax API. Just contact me if you need a trial version. Impala has an amazing response time!