Hi all,
I’m currently struggling with a Spark job.
The job de-normalizes data by joining about 30 tables in Spark SQL (see the SQL).
While the job runs, I get the error below:
[WARN ]: org.apache.spark.scheduler.TaskSetManager - Lost task 74.0 in stage 26.0 (TID 10248, ip-10-118-121-62.ap-northeast-1.compute.internal, executor 12): java.io.IOException: org.apache.spark.SparkException: corrupt remote block broadcast_116_piece0 of broadcast_116: -461336360 != 2000236512
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1350)
... (omitted)
Caused by: org.apache.spark.SparkException: corrupt remote block broadcast_116_piece0 of broadcast_116: -461336360 != 2000236512
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:167)
... (omitted)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:211)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1343)
... 35 more
As this message shows, a remote broadcast block seems to have been corrupted for some unknown reason.
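To make clear how I read the message: as far as I understand, the two numbers in "-461336360 != 2000236512" are checksums, the one computed on the block the executor received and the one recorded when the broadcast was written. Below is a minimal Python sketch of that kind of check, not Spark's actual code. It assumes an Adler-32 checksum wrapped to a signed 32-bit integer (which would explain the negative value), and the helper names `to_int32`, `block_checksum`, and `verify_block` are hypothetical names of mine.

```python
import zlib


def to_int32(value: int) -> int:
    """Wrap an unsigned 32-bit value into the signed range a JVM Int can hold."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def block_checksum(block: bytes) -> int:
    """Adler-32 of a block as a signed 32-bit integer (assumed checksum scheme)."""
    return to_int32(zlib.adler32(block))


def verify_block(block: bytes, expected: int, block_id: str, broadcast_id: str) -> None:
    """Raise if the received block does not match the checksum recorded by the sender."""
    actual = block_checksum(block)
    if actual != expected:
        # Same shape as the message in my log: computed value first, expected second.
        raise IOError(
            f"corrupt remote block {block_id} of {broadcast_id}: {actual} != {expected}"
        )


if __name__ == "__main__":
    original = b"broadcast payload " * 1000
    expected = block_checksum(original)

    # An intact copy passes silently.
    verify_block(original, expected, "broadcast_116_piece0", "broadcast_116")

    # Flip a single bit to simulate damage in transit or on disk.
    damaged = bytearray(original)
    damaged[10] ^= 0x01
    try:
        verify_block(bytes(damaged), expected, "broadcast_116_piece0", "broadcast_116")
    except IOError as err:
        print(err)
```

If this reading is right, the block changed somewhere between the point where it was written and the point where the executor read it, but the check alone does not say where.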
Do you know what might be causing this?
Here are the properties and the full log for this message.
Any advice on this issue would be appreciated.
Let me add some comments.
I have also added the Spark parameters.
Regards.