We have been facing severe issues connecting to our Cloudera cluster from a Talend Big Data Spark job; we keep getting the error shown in the log below.
Our job is submitted to Spark, but we are wondering whether we are missing any Spark configuration parameters on the Talend end.
Talend version: 6.3.1
Cloudera version: 5.12
Any suggestions would be of great help.
Thank you
Starting job test_spark at 01:42 24/08/2017.
[statistics] connecting to socket on port 3728
[statistics] connected
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Talend/6.3.1/Talend-Studio-20161216_1026-V6.3.1/Talend-Studio-20161216_1026-V6.3.1/workspace/.Java/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Talend/6.3.1/Talend-Studio-20161216_1026-V6.3.1/Talend-Studio-20161216_1026-V6.3.1/workspace/.Java/lib/talend-spark-assembly-1.6.0-cdh5.8.1-hadoop2.6.0-cdh5.8.1-with-hive.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[WARN ]: org.apache.spark.SparkConf - In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
[WARN ]: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[ERROR]: org.apache.spark.SparkContext - Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:124)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:64)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
at big_data.test_spark_0_1.test_spark.runJobInTOS(test_spark.java:1487)
at big_data.test_spark_0_1.test_spark.main(test_spark.java:1374)
[WARN ]: org.apache.spark.metrics.MetricsSystem - Stopping a MetricsSystem that is not running
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:124)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:64)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
at big_data.test_spark_0_1.test_spark.runJobInTOS(test_spark.java:1487)
at big_data.test_spark_0_1.test_spark.main(test_spark.java:1374)
Exception in thread "main" java.lang.RuntimeException: TalendJob: 'test_spark' - Failed with exit code: 1.
at big_data.test_spark_0_1.test_spark.main(test_spark.java:1384)
[ERROR]: big_data.test_spark_0_1.test_spark - TalendJob: 'test_spark' - Failed with exit code: 1.
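For context, the hedged sketch below shows roughly what a generated Talend job does when it builds its context in yarn-client mode, and the Spark 1.6 properties that most often matter for this exception. The property names are standard Spark/Hadoop ones, but every host name, port, and value here is a placeholder assumption, not something taken from this thread; in Studio the same key/value pairs would go into the job's Spark Configuration tab under Advanced properties.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkYarnClientSketch {
    public static void main(String[] args) {
        // Minimal yarn-client setup; all endpoints below are placeholders.
        SparkConf conf = new SparkConf()
                .setAppName("test_spark")
                .setMaster("yarn-client")
                // HDFS NameNode: if this does not match the cluster, the staging
                // upload for the ApplicationMaster fails before the job starts.
                .set("spark.hadoop.fs.defaultFS", "hdfs://namenode.example.com:8020")
                // YARN ResourceManager endpoints the client submits to.
                .set("spark.hadoop.yarn.resourcemanager.address", "rm.example.com:8032")
                .set("spark.hadoop.yarn.resourcemanager.scheduler.address", "rm.example.com:8030")
                // ApplicationMaster memory: requests above the cluster's container
                // limits also surface as "unable to launch application master".
                .set("spark.yarn.am.memory", "512m");

        JavaSparkContext sc = new JavaSparkContext(conf);
        System.out.println("Spark version: " + sc.version());
        sc.stop();
    }
}

If the context still fails with the same exception, the YARN ResourceManager UI (or yarn logs -applicationId <application id>) usually shows why the ApplicationMaster exited.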
Hello,
Is it a Spark batch job? Is your cluster correctly configured? Is your connection defined in the repository? More information would help us address your issue; screenshots are preferred.
Note: Please mask your sensitive data.
Best regards
Sabrina
Hi xdshi,
Yes, it is a Spark batch job.
My cluster is configured correctly; I have attached a screenshot for reference.
I am able to run a MapReduce job using this same cluster configuration (I am using the Cloudera distribution), but I hit this issue only when running a Spark batch job.
I have attached screenshots showing my job, the Spark configuration, and the error. Please help me out.
Hi,
I am suffering from the same problem. Can you please tell me how you solved it?
Best Regards
Can you please tell us how you solved this problem?
Hi siddarthaartha,
Could you share your solution to this problem please?
Currently, I am facing the same problem as you have mentioned.
Thanks.