<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Handling Huge XML files in Talend - OutOfMemoryError (tAdvancedFileOutputXML) in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Handling-Huge-XML-files-in-Talend-OutOfMemoryError/m-p/2329820#M98975</link>
    <description>Yes, but this task is much easier to do. Simply remove all root tags (i.e. let the files become fragments) and join the files.
&lt;BR /&gt;
&lt;BR /&gt;</description>
    <pubDate>Tue, 05 Sep 2017 08:39:50 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2017-09-05T08:39:50Z</dc:date>
    <item>
      <title>Handling Huge XML files in Talend - OutOfMemoryError (tAdvancedFileOutputXML)</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Handling-Huge-XML-files-in-Talend-OutOfMemoryError/m-p/2329817#M98972</link>
      <description>&lt;P&gt;Hello all,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I am trying to generate a large XML file using tAdvancedFileOutputXML.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;When I run the job on my local machine, I get an OutOfMemoryError:&amp;nbsp;&lt;FONT color="#FF0000"&gt;Exception in thread "main" java.lang.OutOfMemoryError: Java heap space&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;My configuration is below, along with some screenshots of the corresponding job:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Talend version: 6.3.1&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Main input file (CSV): 88,151 rows&lt;/P&gt; 
&lt;P&gt;Lookup file (CSV): 5,994,268 rows&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="sfg.PNG" style="width: 867px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LwX3.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/141061iC97E56E47159D7F2/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LwX3.png" alt="0683p000009LwX3.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;1. As an optimization, I have enabled "Store temp data" in the tMap lookup.&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capturerr.PNG" style="width: 543px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lw0W.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/134970i27787F77F60558FA/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lw0W.png" alt="0683p000009Lw0W.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;2. I have changed the generation mode to "Fast with low memory consumption".&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="sdf.PNG" style="width: 386px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LwK6.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/149671i4EB2CCD6AEA2E683/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LwK6.png" alt="0683p000009LwK6.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;3. As my local machine has 8 GB of RAM, I have also changed the JVM settings:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Captureff.PNG" style="width: 259px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LwEH.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/132190iF79CE96B93161698/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LwEH.png" alt="0683p000009LwEH.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;4. I have also made use of the "output stream" option using a tJava component:&lt;BR /&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capturep.PNG" style="width: 457px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LwXD.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/154182i7AA4B7FDF6B1391A/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LwXD.png" alt="0683p000009LwXD.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;The tJava component contains:&amp;nbsp;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capturert.PNG" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LwXS.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/142766i52CDD63EAF982BB6/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LwXS.png" alt="0683p000009LwXS.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Despite these settings, I am still unable to generate the XML file and keep hitting the&amp;nbsp;&lt;SPAN&gt;OutOfMemoryError.&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Can you advise, please? Thank you.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 09:20:18 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Handling-Huge-XML-files-in-Talend-OutOfMemoryError/m-p/2329817#M98972</guid>
      <dc:creator>RA6</dc:creator>
      <dc:date>2024-11-16T09:20:18Z</dc:date>
    </item>
    <item>
      <title>Re: Handling Huge XML files in Talend - OutOfMemoryError (tAdvancedFileOutputXML)</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Handling-Huge-XML-files-in-Talend-OutOfMemoryError/m-p/2329818#M98973</link>
      <description>&lt;P&gt;It is actually never a good idea to create one huge XML file. The problem is not only the creation process but also the next step: reading such a huge file.&lt;/P&gt;&lt;P&gt;What about creating multiple files instead of one?&lt;/P&gt;</description>
      <pubDate>Mon, 04 Sep 2017 22:02:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Handling-Huge-XML-files-in-Talend-OutOfMemoryError/m-p/2329818#M98973</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-09-04T22:02:10Z</dc:date>
    </item>
    <item>
      <title>Re: Handling Huge XML files in Talend - OutOfMemoryError (tAdvancedFileOutputXML)</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Handling-Huge-XML-files-in-Talend-OutOfMemoryError/m-p/2329819#M98974</link>
      <description>&lt;P&gt;Hello Jlolling,&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=""&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=""&gt;Thank you for your reply and I understand completely your idea.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=""&gt;I have tried splitting the xml into multiple files and it is much faster.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=""&gt;The problem is at the end, we will have to merge them to create one which will be handled by an application that can only accept one file during run-time.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=""&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=""&gt;The finally xml should be about 1.5 Gb.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=""&gt;Do you have any idea show can the actual job be optimized?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 05 Sep 2017 08:18:55 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Handling-Huge-XML-files-in-Talend-OutOfMemoryError/m-p/2329819#M98974</guid>
      <dc:creator>RA6</dc:creator>
      <dc:date>2017-09-05T08:18:55Z</dc:date>
    </item>
    <item>
      <title>Re: Handling Huge XML files in Talend - OutOfMemoryError (tAdvancedFileOutputXML)</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Handling-Huge-XML-files-in-Talend-OutOfMemoryError/m-p/2329820#M98975</link>
      <description>Yes, but this task is much easier to do. Simply remove all root tags (i.e. let the files become fragments) and join the files.
&lt;BR /&gt;
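<![CDATA[
A minimal sketch of that merge step, written as standalone Java rather than as a Talend component. It assumes every part file is UTF-8, uses the same root element without attributes, and keeps the XML declaration and the root start and end tags on lines of their own. The class name XmlFragmentJoiner, the method join, and the root name passed in are hypothetical and not part of Talend. The files are copied line by line, so only one line is held in memory at a time, and blank lines are dropped along the way.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper (not a Talend API): joins XML part files under one root element.
final class XmlFragmentJoiner {

    // Assumes the XML declaration and the root tags each sit on their own line
    // in every part file, and that the root tag has no attributes.
    static void join(List<Path> parts, Path target, String rootName) throws IOException {
        String open = "<" + rootName + ">";
        String close = "</" + rootName + ">";
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            // Write a single declaration and a single root start tag.
            out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
            out.write(open + "\n");
            for (Path part : parts) {
                try (BufferedReader in = Files.newBufferedReader(part, StandardCharsets.UTF_8)) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        String trimmed = line.trim();
                        // Skip blank lines, per-file declarations and per-file root tags,
                        // which turns each part into a fragment.
                        if (trimmed.isEmpty() || trimmed.startsWith("<?xml")
                                || trimmed.equals(open) || trimmed.equals(close)) {
                            continue;
                        }
                        out.write(line);
                        out.write("\n");
                    }
                }
            }
            // Close the single root element.
            out.write(close + "\n");
        }
    }
}
```

Something like XmlFragmentJoiner.join(parts, target, "rows") could be called from a tJava component or a routine once all part files have been written. If the root tags carry attributes or share a line with other elements, the line matching would need to be adapted.
]]>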
&lt;BR /&gt;</description>
      <pubDate>Tue, 05 Sep 2017 08:39:50 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Handling-Huge-XML-files-in-Talend-OutOfMemoryError/m-p/2329820#M98975</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-09-05T08:39:50Z</dc:date>
    </item>
    <item>
      <title>Re: Handling Huge XML files in Talend - OutOfMemoryError (tAdvancedFileOutputXML)</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Handling-Huge-XML-files-in-Talend-OutOfMemoryError/m-p/2329821#M98976</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Indeed, the root tags should be removed.&lt;/P&gt; 
&lt;P&gt;I am new to Talend. Can you tell me how this can be implemented, that is, how to keep only one root tag in the merged file?&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Also, leaving the root tag aside, I have tried to merge the files using the tUnite component, but it creates a blank row after each line, which increases the file size:&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Captured.PNG" style="width: 500px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LwBe.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/138222i02295871AD354DF0/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LwBe.png" alt="0683p000009LwBe.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 05 Sep 2017 09:22:25 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Handling-Huge-XML-files-in-Talend-OutOfMemoryError/m-p/2329821#M98976</guid>
      <dc:creator>RA6</dc:creator>
      <dc:date>2017-09-05T09:22:25Z</dc:date>
    </item>
  </channel>
</rss>

