Anonymous
Not applicable

ExecutorLostFailure in Talend Spark job when running in YARN client mode

Our Talend Spark job fails when it processes a large XML message of 1.2 GB.

 

We have two core nodes with 64 GB each and one master node with 64 GB.

We are running the job in YARN client mode with the attached Spark configuration.

 

[ERROR]: org.apache.spark.internal.io.SparkHadoopWriter - Aborting job job_id.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 6 in stage 0.0 failed 4 times, most recent failure: Lost task 6.3 in stage 0.0 (TID 17, ip-XX-XXX-XXX-XXX.XXXXX.com, executor 6): ExecutorLostFailure (executor 6 exited caused by one of the running tasks) Reason: Container marked as failed: container_id on host: ip-XX-XXX-XXX-XXX.XXXXX.com. Exit status: 50. Diagnostics: Exception from container-launch.
Container id: container_id
Exit code: 50
Stack trace: ExitCodeException exitCode=50:

 

Please share any suggestions.

3 Replies
Anonymous
Not applicable
Author

Hello,

Can you please clarify which Talend Big Data version/edition you are using?

Here is a Jira issue: https://jira.talendforge.org/browse/TBD-4872. It is fixed in V 6.4.1.

Best regards

Sabrina

Anonymous
Not applicable
Author

Thanks for your reply.

We are using 7.1.1
Anonymous
Not applicable
Author

We are using version 7.1 of Big Data Studio.
The same job works with a smaller load. The error occurred when we increased the input to a 1.2 GB file.

It seems we are not configuring Spark properly.
We need to understand the Spark properties in detail.
Based on our EMR cluster configuration, how can we calculate the executor memory, driver memory, memory overhead, number of instances, number of cores, and any other Spark properties?

Please post some links or tutorials on configuring these properties in Talend, and any best practices for performance tuning.
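
As a rough starting point for the sizing question, here is a minimal Python sketch of one commonly cited rule of thumb: leave one core per node for the OS and Hadoop daemons, use about five cores per executor, split the memory YARN can allocate on each node evenly across the executors, and reserve one executor slot for the YARN application master. The overhead calculation mirrors Spark's documented default for `spark.executor.memoryOverhead` (10% of executor memory, minimum 384 MiB). The function name `size_executors` and the example inputs (16 vCPUs per node, 56 GB usable by YARN per node) are assumptions for illustration only; the thread does not state the vCPU count or the YARN memory limit of the cluster, so these must be replaced with the real values.

```python
import math


def size_executors(nodes, cores_per_node, yarn_mem_mb_per_node,
                   cores_per_executor=5, overhead_factor=0.10,
                   min_overhead_mb=384):
    """Rule-of-thumb Spark-on-YARN sizing (hypothetical helper, not a Talend API)."""
    # Leave one core per node for the OS and Hadoop daemons.
    usable_cores = cores_per_node - 1
    executors_per_node = max(1, usable_cores // cores_per_executor)

    # Each executor runs in one YARN container: heap + memory overhead.
    container_mb = yarn_mem_mb_per_node // executors_per_node
    heap_mb = int(container_mb / (1 + overhead_factor))
    overhead_mb = max(min_overhead_mb, math.ceil(heap_mb * overhead_factor))
    if heap_mb + overhead_mb > container_mb:
        # The minimum overhead dominates on small containers; shrink the heap.
        heap_mb = container_mb - overhead_mb

    # Reserve one executor slot for the YARN application master.
    instances = max(1, nodes * executors_per_node - 1)

    return {
        "spark.executor.instances": str(instances),
        "spark.executor.cores": str(cores_per_executor),
        "spark.executor.memory": f"{heap_mb}m",
        # Named spark.yarn.executor.memoryOverhead before Spark 2.3.
        "spark.executor.memoryOverhead": f"{overhead_mb}m",
    }


# Hypothetical inputs: 2 core nodes, 16 vCPUs each, 56 GB usable by YARN per node.
conf = size_executors(nodes=2, cores_per_node=16, yarn_mem_mb_per_node=56 * 1024)
for key, value in conf.items():
    print(f"{key}={value}")
```

The resulting values are only a first guess to enter in the Spark configuration of the Talend job. In YARN client mode the driver runs on the machine that launches the job, so `spark.driver.memory` has to be sized against that machine rather than against the cluster nodes.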