Hello,
My goal is to run a Spark Big Data batch job using Talend in local mode, with no third-party clusters or distributions.
I want to save the output to S3, but before that I want to register the data as an external table in a Hive metastore.
I would like to use an external Hive metastore database. I was able to connect to an external MySQL database as the metastore from my spark-shell.
I am having trouble working out how to set the spark.sql.hive.metastore.jars property in the Spark configuration on the Run tab. I couldn't find any information in the documentation.
Thanks for looking into this.
Chandana
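For anyone following along, here is a rough sketch of the properties that are usually involved when a local Spark session is pointed at an external MySQL-backed metastore. It only builds the key/value pairs; it does not start Spark. The host, database name, user and password are placeholders, the function name is made up for this post, and where exactly Talend Studio lets you enter such pairs (for example an advanced-properties table in the Spark configuration) is an assumption to check against your Studio version. The `spark.hadoop.` prefix relies on Spark copying such properties into the Hadoop configuration, from which the Hive client reads its `javax.jdo.option.*` settings.

```python
def metastore_properties(jdbc_url, user, password,
                         driver="com.mysql.cj.jdbc.Driver"):
    """Build Spark properties for an external Hive metastore database.

    The default driver class is the one shipped with MySQL Connector/J 8.x.
    All argument values passed in below are placeholders.
    """
    return {
        # Use the Hive catalog instead of the in-memory one.
        "spark.sql.catalogImplementation": "hive",
        # JDBC connection details for the metastore database.
        "spark.hadoop.javax.jdo.option.ConnectionURL": jdbc_url,
        "spark.hadoop.javax.jdo.option.ConnectionDriverName": driver,
        "spark.hadoop.javax.jdo.option.ConnectionUserName": user,
        "spark.hadoop.javax.jdo.option.ConnectionPassword": password,
    }


if __name__ == "__main__":
    props = metastore_properties(
        "jdbc:mysql://metastore-host:3306/hive_metastore",  # placeholder URL
        "hive_user",                                        # placeholder user
        "change-me",                                        # placeholder password
    )
    # Print only the keys so the placeholder password is not echoed.
    for key in sorted(props):
        print(key)
```

The same pairs could be passed to spark-shell with `--conf key=value`, which is a quick way to confirm they work before moving them into the Talend job.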
@csapparapu, You can define below way in
Thanks for your reply. My question was about which jar files to include. I have tried several jar files, as shown below, but still cannot run spark-sql in local mode.
Do you have an example of a Spark batch job in local mode with an external Hive metastore?
Should the jar files be comma-separated?
"file:///Users/abc/.m2/repository/mysql/mysql-connector-java/8.0.18/mysql-connector-java-8.0.18.jar;file:///Applications/TalendStudio-7.2.1/studio/configuration/.m2/repository/org/talend/libraries/hadoop-common-2.8.1/6.0.0/hadoop-common-2.8.1-6.0.0.jar;file:///Applications/TalendStudio-7.2.1/studio/configuration/.m2/repository/org/talend/libraries/spark-hive_2.11-2.2.0/6.0.0/spark-hive_2.11-2.2.0-6.0.0.jar;/Applications/TalendStudio-7.2.1/studio/configuration/.m2/repository/org/talend/libraries/hadoop-hdfs-2.6.0.2.2.0.0-2041/6.0.0/hadoop-hdfs-2.6.0.2.2.0.0-2041-6.0.0.jar;file:///Applications/TalendStudio-7.2.1/studio/configuration/.m2/repository/org/talend/libraries/hive-exec-2.1.0-talend-nolang3/6.0.0/hive-exec-2.1.0-talend-nolang3-6.0.0.jar;file:///Applications/TalendStudio-7.2.1/studio/configuration/.m2/repository/org/talend/libraries/hive-jdbc-2.1.0-amzn-0/6.0.0/hive-jdbc-2.1.0-amzn-0-6.0.0.jar"
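On the separator question: the Spark documentation describes spark.sql.hive.metastore.jars as taking `builtin`, `maven`, or a classpath in the standard JVM format. A standard JVM classpath is a list of plain filesystem paths joined with the platform path separator, which is `:` on macOS and Linux and `;` on Windows, so neither commas nor (on a Mac) semicolons would be expected to work, and the `file://` prefixes may also get in the way. The sketch below shows one way to build such a string. The helper name is hypothetical, and whether this alone fixes the job is untested; the value of spark.sql.hive.metastore.version probably also needs to match the Hive jars being listed.

```python
import os
from urllib.parse import unquote, urlparse


def metastore_classpath(paths, separator=os.pathsep):
    """Join jar locations into a JVM-style classpath string.

    Entries written as file:// URLs are reduced to plain filesystem
    paths; all other entries are kept unchanged.
    """
    cleaned = []
    for entry in paths:
        parsed = urlparse(entry)
        if parsed.scheme == "file":
            # file:///Users/abc/x.jar -> /Users/abc/x.jar
            cleaned.append(unquote(parsed.path))
        else:
            cleaned.append(entry)
    return separator.join(cleaned)


if __name__ == "__main__":
    jars = [
        "file:///Users/abc/.m2/repository/mysql/mysql-connector-java/8.0.18/mysql-connector-java-8.0.18.jar",
        "/path/to/hive-exec.jar",  # placeholder path
    ]
    # os.pathsep is ':' on macOS/Linux and ';' on Windows.
    print(metastore_classpath(jars))
```

The resulting string is what would go into the value of spark.sql.hive.metastore.jars.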