csapparapu
Contributor

Spark Big Data job in local mode - Configuring external hive metastore

Hello,

My goal is to run a Spark Big Data batch job using Talend in local mode, with no third-party clusters or distributions.

I want to save the output to S3, but before that I want to register the data as an external table in a Hive metastore.

I would like to use an external Hive metastore database. I was able to connect to an external MySQL database as the metastore from my spark-shell.
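
For reference, a connection to an external MySQL metastore is usually described by Hive's `javax.jdo.option.*` properties, which Spark can receive through the `spark.hadoop.` prefix. The sketch below only builds that set of property names and values; the host, port, database name, and credentials are hypothetical placeholders, and the driver class assumes MySQL Connector/J 8.x. It is not taken from the Talend documentation.

```python
def metastore_connection_properties(host, port, database, user, password):
    """Build Spark properties that point Hive at an external MySQL metastore.

    All argument values used below are hypothetical placeholders.
    """
    jdbc_url = f"jdbc:mysql://{host}:{port}/{database}"
    prefix = "spark.hadoop.javax.jdo.option."
    return {
        prefix + "ConnectionURL": jdbc_url,
        # Driver class name used by MySQL Connector/J 8.x (assumption).
        prefix + "ConnectionDriverName": "com.mysql.cj.jdbc.Driver",
        prefix + "ConnectionUserName": user,
        prefix + "ConnectionPassword": password,
    }


props = metastore_connection_properties(
    "localhost", 3306, "hive_metastore", "hive", "change-me"
)
for key, value in props.items():
    print(f"{key} = {value}")
```

Each key/value pair would then be entered as one row of Spark properties, alongside `spark.sql.hive.metastore.jars`.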

 

I am having trouble working out how to set the spark.sql.hive.metastore.jars property in the Run tab's Spark configuration. I couldn't find any information in the documentation.

 

Thanks for looking into this.

Chandana

2 Replies
manodwhb
Champion II

@csapparapu, you can define it as described below:

Define the advanced settings
Define Spark advanced settings in the Studio to read Spark 2.0 jar files in your cluster.
Procedure
  1. In the Advanced properties table, to add a row, click the plus symbol (+).
  2. In the Property column, in double quotation marks, enter spark.sql.hive.metastore.jars. This parameter provides the names of jar files to be used by your Spark Job, as well as the paths to them in your cluster.
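
The procedure above can be sketched as a small helper that produces the quoted Property and Value cells for the Advanced properties table. This is only an illustration: the helper names are made up, and the version string and jar path are placeholders that would need to match the Hive jars actually in use. Pairing the jars setting with `spark.sql.hive.metastore.version` is an assumption based on Spark's own configuration options, not something stated in the procedure.

```python
def quoted(text):
    """Wrap text in double quotation marks, escaping embedded ones."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def advanced_property_row(name, value):
    """Return the (Property, Value) cells for one Advanced properties row."""
    return (quoted(name), quoted(value))


rows = [
    # Placeholder version: should match the Hive jars listed below.
    advanced_property_row("spark.sql.hive.metastore.version", "X.Y.Z"),
    # Placeholder path: replace with the real jar locations.
    advanced_property_row("spark.sql.hive.metastore.jars", "/path/to/hive-jars/hive-exec.jar"),
]

for prop, value in rows:
    print(f"{prop}\t{value}")
```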
csapparapu
Contributor
Author

@manodwhb,

Thanks for your reply. My question was which jar files to include. I have tried several jar files, as shown below, but I still cannot run spark-sql in local mode.

Do you have an example of a Spark batch job in local mode with an external Hive metastore?

Are these jar files comma-separated?

 

"file:///Users/abc/.m2/repository/mysql/mysql-connector-java/8.0.18/mysql-connector-java-8.0.18.jar;file:///Applications/TalendStudio-7.2.1/studio/configuration/.m2/repository/org/talend/libraries/hadoop-common-2.8.1/6.0.0/hadoop-common-2.8.1-6.0.0.jar;file:///Applications/TalendStudio-7.2.1/studio/configuration/.m2/repository/org/talend/libraries/spark-hive_2.11-2.2.0/6.0.0/spark-hive_2.11-2.2.0-6.0.0.jar;/Applications/TalendStudio-7.2.1/studio/configuration/.m2/repository/org/talend/libraries/hadoop-hdfs-2.6.0.2.2.0.0-2041/6.0.0/hadoop-hdfs-2.6.0.2.2.0.0-2041-6.0.0.jar;file:///Applications/TalendStudio-7.2.1/studio/configuration/.m2/repository/org/talend/libraries/hive-exec-2.1.0-talend-nolang3/6.0.0/hive-exec-2.1.0-talend-nolang3-6.0.0.jar;file:///Applications/TalendStudio-7.2.1/studio/configuration/.m2/repository/org/talend/libraries/hive-jdbc-2.1.0-amzn-0/6.0.0/hive-jdbc-2.1.0-amzn-0-6.0.0.jar"
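
On the separator question: Spark's documentation describes a custom `spark.sql.hive.metastore.jars` value as a classpath in the standard JVM format. That conventionally means plain filesystem paths joined by the platform path separator (`:` on macOS and Linux, `;` on Windows) rather than `file://` URIs joined by semicolons or commas. The sketch below converts a value in the style shown above into that form. It assumes this reading of the documentation applies to the Spark version bundled with the Studio, and the jar paths are shortened hypothetical examples.

```python
from urllib.parse import unquote, urlparse


def to_local_path(entry):
    """Turn a file:// URI into a plain path; leave plain paths unchanged."""
    parsed = urlparse(entry)
    if parsed.scheme == "file":
        return unquote(parsed.path)
    return entry


def jvm_classpath(entries, separator=":"):
    """Join entries with the JVM path separator (':' on macOS/Linux)."""
    return separator.join(to_local_path(entry) for entry in entries)


# Hypothetical, shortened stand-in for the semicolon-separated value above.
original_value = "file:///opt/jars/mysql-connector.jar;/opt/jars/hive-exec.jar"
classpath = jvm_classpath(original_value.split(";"))
print(classpath)
```

The jars listed would also need to include Hive and the Hadoop dependencies that match the configured metastore version.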