Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I am not able to run a Spark Big Data Batch job using Spark 2.3 or Spark 2.4 version locally.
I would like to know if Talend Big Data Platform V7.X works with Spark 2.3 or Spark 2.4 in local mode.
I get the following exception. Spark 2.2 version works fine.
java.lang.IllegalArgumentException at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source) at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source) at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source) at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:46) at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:449) at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:432) at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:103) at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:103) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40) at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:103) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) at org.apache.spark.util.FieldAccessFinder$$anon$3.visitMethodInsn(ClosureCleaner.scala:432) at org.apache.xbean.asm5.ClassReader.a(Unknown Source) at org.apache.xbean.asm5.ClassReader.b(Unknown Source) at org.apache.xbean.asm5.ClassReader.accept(Unknown Source) at org.apache.xbean.asm5.ClassReader.accept(Unknown Source) at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:262) at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:261) at scala.collection.immutable.List.foreach(List.scala:381) at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:261) at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159) at org.apache.spark.SparkContext.clean(SparkContext.scala:2292) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2066) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2092) at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:939) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:363) at org.apache.spark.rdd.RDD.collect(RDD.scala:938) at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.mergeSchemasInParallel(ParquetFileFormat.scala:612) at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat.inferSchema(ParquetFileFormat.scala:241) at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$8.apply(DataSource.scala:202) at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$8.apply(DataSource.scala:202) at scala.Option.orElse(Option.scala:289) at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:201) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:392) at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227) at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:620) at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:604)
Thanks,
Chandana
@csapparapu , I have use Talend Cloud Real-time Big Data Platform,where 2.3 is working in local mode.
@manodwhb, thanks for your response.
After spending a lot of time on this, I realized the problem is with Talend Big Data platform.
I've raised a bug on TalendForge.
So the bug I opened was rejected and I dug deeper to see what's wrong with my settings.
Apparently there is nothing wrong with my settings, just one component, tLogRow doesn't work.
When I used other components, it was working fine with Spark 2.4.
The following job was successful.