<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to set the &quot;org.apache.spark.serializer.KryoSerialize in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/How-to-set-the-quot-org-apache-spark-serializer-KryoSerialize/m-p/2235953#M24835</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;I am using the Talend Cloud BigData platform, version 7.1.1.&lt;/P&gt;
&lt;P&gt;In the map, a field read from a Parquet file holds an XML value, which is quite large (about 12 KB per field).&lt;/P&gt;
&lt;P&gt;The job fails with the error below.&lt;/P&gt;
&lt;P&gt;How do I set the Customize Spark serializer option "org.apache.spark.serializer.KryoSerializer"?&lt;/P&gt;
&lt;P&gt;What value do I need to enter in the box to increase the memory?&lt;/P&gt;
&lt;P&gt;Error message:&lt;/P&gt;
&lt;P&gt;#############################################################################################&lt;/P&gt;
&lt;P&gt;Caused by: org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 12264&lt;BR /&gt;Serialization trace:&lt;BR /&gt;xmldata (t_data.t_data_staging_flight_passenger_0_1.row1Struct). To avoid this, increase spark.kryoserializer.buffer.max value.&lt;BR /&gt;at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:318)&lt;BR /&gt;at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:383)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:748)&lt;BR /&gt;Caused by: com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0, required: 12264&lt;/P&gt;
&lt;P&gt;#############################################################################################&lt;/P&gt;</description>
    <pubDate>Sat, 16 Nov 2024 06:42:05 GMT</pubDate>
    <dc:creator>badri-nair</dc:creator>
    <dc:date>2024-11-16T06:42:05Z</dc:date>
    <item>
      <title>How to set the "org.apache.spark.serializer.KryoSerialize</title>
      <link>https://community.qlik.com/t5/Talend-Studio/How-to-set-the-quot-org-apache-spark-serializer-KryoSerialize/m-p/2235953#M24835</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;I am using the Talend Cloud BigData platform, version 7.1.1.&lt;/P&gt;
&lt;P&gt;In the map, a field read from a Parquet file holds an XML value, which is quite large (about 12 KB per field).&lt;/P&gt;
&lt;P&gt;The job fails with the error below.&lt;/P&gt;
&lt;P&gt;How do I set the Customize Spark serializer option "org.apache.spark.serializer.KryoSerializer"?&lt;/P&gt;
&lt;P&gt;What value do I need to enter in the box to increase the memory?&lt;/P&gt;
&lt;P&gt;Error message:&lt;/P&gt;
&lt;P&gt;#############################################################################################&lt;/P&gt;
&lt;P&gt;Caused by: org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 12264&lt;BR /&gt;Serialization trace:&lt;BR /&gt;xmldata (t_data.t_data_staging_flight_passenger_0_1.row1Struct). To avoid this, increase spark.kryoserializer.buffer.max value.&lt;BR /&gt;at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:318)&lt;BR /&gt;at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:383)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:748)&lt;BR /&gt;Caused by: com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0, required: 12264&lt;/P&gt;
&lt;P&gt;#############################################################################################&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 06:42:05 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/How-to-set-the-quot-org-apache-spark-serializer-KryoSerialize/m-p/2235953#M24835</guid>
      <dc:creator>badri-nair</dc:creator>
      <dc:date>2024-11-16T06:42:05Z</dc:date>
    </item>
    <item>
      <title>Re: How to set the "org.apache.spark.serializer.KryoSerialize</title>
      <link>https://community.qlik.com/t5/Talend-Studio/How-to-set-the-quot-org-apache-spark-serializer-KryoSerialize/m-p/2235954#M24836</link>
      <description>&lt;P&gt;I found the solution myself.&lt;/P&gt;
&lt;P&gt;Edit the Hadoop cluster connection under Metadata (the values need to be unexported).&lt;/P&gt;
&lt;P&gt;Click the "use spark configuration" button.&lt;/P&gt;
&lt;P&gt;There you can enter key-value pairs. Insert a row and enter the value as shown in the screenshot. It worked for me.&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="hadoop_cluster_settings.JPG" style="width: 578px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M2NP.jpg"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/155360iF0964C9BAF28E41F/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M2NP.jpg" alt="0683p000009M2NP.jpg" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
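&lt;P&gt;For readers who cannot see the screenshot, here is a minimal sketch of what such key-value rows could look like, written out as plain Python. The two keys are standard Spark settings (the second is the one named in the error message), but the value "256m" is only an assumed example and not necessarily the value in the screenshot. The helper to_submit_args is hypothetical and only illustrates the same pairs as spark-submit --conf arguments.&lt;/P&gt;

```python
# Sketch only: the keys are real Spark settings, but the "256m" value is an
# assumed example, not the value taken from the screenshot.
SPARK_PROPERTIES = {
    # Use Kryo instead of the default Java serialization.
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    # Upper bound for the Kryo serialization buffer (Spark's default is 64m).
    "spark.kryoserializer.buffer.max": "256m",
}


def to_submit_args(properties):
    """Render key-value pairs as spark-submit --conf arguments (hypothetical helper)."""
    args = []
    for key, value in properties.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    # Print the pairs in the form they would take on a spark-submit command line.
    print(" ".join(to_submit_args(SPARK_PROPERTIES)))
```

&lt;P&gt;According to the Spark documentation, spark.kryoserializer.buffer.max must be larger than the largest object being serialized and smaller than 2048m, so the right value depends on your data.&lt;/P&gt;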
</description>
      <pubDate>Fri, 01 Feb 2019 11:51:31 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/How-to-set-the-quot-org-apache-spark-serializer-KryoSerialize/m-p/2235954#M24836</guid>
      <dc:creator>badri-nair</dc:creator>
      <dc:date>2019-02-01T11:51:31Z</dc:date>
    </item>
  </channel>
</rss>

