badri-nair
Contributor

How to set the "org.apache.spark.serializer.KryoSerializer" option

Hi,

I am using the Talend Cloud Big Data platform, version 7.1.1.

In the map, a Parquet file is read that contains a field holding an XML value; it is quite large (about 12 KB per field).

The job fails with the error below.

How do I set the Customize Spark serializer option to "org.apache.spark.serializer.KryoSerializer"?

And what value do I need to put in the box to increase the buffer memory?

 

ERROR message: 

#############################################################################################

Caused by: org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 12264
Serialization trace:
xmldata (t_data.t_data_staging_flight_passenger_0_1.row1Struct). To avoid this, increase spark.kryoserializer.buffer.max value.
at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:318)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:383)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0, required: 12264

#############################################################################################

1 Reply
badri-nair
Contributor
Author

Found the solution myself.

Edit the Hadoop cluster connection under Metadata (the values need to be un-exported so they are editable).

Click the "Use Spark configuration" button.

There you can enter key/value pairs: insert a row and enter the values as in the screenshot. It worked for me.

0683p000009M2NP.jpg
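
For reference, a sketch of the two Spark properties involved, based on the stack trace above. The 256m size is an assumption, not from the screenshot: pick any value comfortably larger than your biggest serialized record (Spark's default for spark.kryoserializer.buffer.max is 64m, and the error shows a single field already needing ~12 KB, so the overflow likely comes from many such fields in one task):

```
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryoserializer.buffer.max=256m
```

In the Talend "Use Spark configuration" table, each line above becomes one row, with the part before `=` as the key and the part after as the value.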