<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: [resolved] Split really big xml file in multiple XML files in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/resolved-Split-really-big-xml-file-in-multiple-XML-files/m-p/2316018#M86604</link>
    <description>Thank you for your help, Mbaroudi! 
&lt;BR /&gt;I just found this this morning: 
&lt;A href="http://linux.die.net/man/1/xml_split" rel="nofollow noopener noreferrer"&gt;http://linux.die.net/man/1/xml_split&lt;/A&gt; 
&lt;BR /&gt;This Linux command splits the file into files of the chosen size and keeps the XML structure. 
&lt;BR /&gt;But I think I'm going to try your way, Mbaroudi (so the job will also run correctly on Windows if needed). 
&lt;BR /&gt;Pikerman: sorry, I don't know much about PHP (please create a separate topic about this).</description>
    <pubDate>Tue, 16 Jul 2013 16:10:20 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2013-07-16T16:10:20Z</dc:date>
    <item>
      <title>[resolved] Split really big xml file in multiple XML files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Split-really-big-xml-file-in-multiple-XML-files/m-p/2316014#M86600</link>
      <description>Hello, 
&lt;BR /&gt;I have to split a 160 GB XML file. 
&lt;BR /&gt;I found a solution in this topic: 
&lt;A href="https://community.qlik.com/s/feed/0D53p00007vCjt6CAC" target="_blank" rel="nofollow noopener noreferrer"&gt;https://community.talend.com/t5/Design-and-Development/resolved-Can-we-Split-One-XML-File-into-Multiple-XML-files-using/td-p/70774&lt;/A&gt; 
&lt;BR /&gt;But my file is so big (160 GB) that I can't use tFileInputXML: I get an OutOfMemory error. 
&lt;BR /&gt;So I wonder whether there is another way to split huge XML files using Talend (or maybe a small program that I could run from the tSSH component)? 
&lt;BR /&gt; 
&lt;BR /&gt;Just for your information, this is what the XML file looks like: 
&lt;BR /&gt; 
&lt;PRE&gt;&amp;lt;ExampleDatabase&amp;gt;&lt;BR /&gt;	&amp;lt;DatabaseEntry&amp;gt;&lt;BR /&gt;		A lot of things.&lt;BR /&gt;	&amp;lt;/DatabaseEntry&amp;gt;&lt;BR /&gt;	&amp;lt;DatabaseEntry&amp;gt;&lt;BR /&gt;		Other things&lt;BR /&gt;	&amp;lt;/DatabaseEntry&amp;gt;&lt;BR /&gt;	&amp;lt;DatabaseEntry&amp;gt;&lt;BR /&gt;		Other things again&lt;BR /&gt;	&amp;lt;/DatabaseEntry&amp;gt;&lt;BR /&gt;&amp;lt;/ExampleDatabase&amp;gt;&lt;/PRE&gt; 
&lt;BR /&gt;I want to split it at the boundaries between &amp;lt;DatabaseEntry&amp;gt; elements. 
&lt;BR /&gt; 
&lt;BR /&gt;Thank you.</description>
      <pubDate>Mon, 15 Jul 2013 15:38:51 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Split-really-big-xml-file-in-multiple-XML-files/m-p/2316014#M86600</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-07-15T15:38:51Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Split really big xml file in multiple XML files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Split-really-big-xml-file-in-multiple-XML-files/m-p/2316015#M86601</link>
      <description>In Talend 5.3.1 this component has an advanced option: Generation Mode: Fast and low memory consumption (SAX).</description>
      <pubDate>Mon, 15 Jul 2013 15:58:22 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Split-really-big-xml-file-in-multiple-XML-files/m-p/2316015#M86601</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-07-15T15:58:22Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Split really big xml file in multiple XML files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Split-really-big-xml-file-in-multiple-XML-files/m-p/2316016#M86602</link>
      <description>Yes, I know; I already use SAX. But even with it, a 160 GB XML file is way too big.</description>
      <pubDate>Mon, 15 Jul 2013 16:14:53 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Split-really-big-xml-file-in-multiple-XML-files/m-p/2316016#M86602</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-07-15T16:14:53Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Split really big xml file in multiple XML files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Split-really-big-xml-file-in-multiple-XML-files/m-p/2316017#M86603</link>
      <description>Hi,
&lt;BR /&gt;You can use XSLT to split a huge XML file with Talend's tXSLT component:
&lt;BR /&gt;Source code:
&lt;BR /&gt;
&lt;PRE&gt;&amp;lt;xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&amp;gt;&lt;BR /&gt;  &amp;lt;xsl:output method="xml" indent="yes"/&amp;gt;&lt;BR /&gt;  &amp;lt;!-- 1-based, inclusive range of DatabaseEntry elements to keep --&amp;gt;&lt;BR /&gt;  &amp;lt;xsl:param name="startPosition"/&amp;gt;&lt;BR /&gt;  &amp;lt;xsl:param name="endPosition"/&amp;gt;&lt;BR /&gt;  &amp;lt;!-- Identity transform: copy everything not matched below --&amp;gt;&lt;BR /&gt;  &amp;lt;xsl:template match="@* | node()"&amp;gt;&lt;BR /&gt;      &amp;lt;xsl:copy&amp;gt;&lt;BR /&gt;          &amp;lt;xsl:apply-templates select="@* | node()"/&amp;gt;&lt;BR /&gt;      &amp;lt;/xsl:copy&amp;gt; &lt;BR /&gt;  &amp;lt;/xsl:template&amp;gt;&lt;BR /&gt;  &amp;lt;!-- Select only DatabaseEntry children so that position() counts entries, not whitespace nodes --&amp;gt;&lt;BR /&gt;  &amp;lt;xsl:template match="ExampleDatabase"&amp;gt;&lt;BR /&gt;    &amp;lt;xsl:copy&amp;gt;&lt;BR /&gt;      &amp;lt;xsl:apply-templates select="DatabaseEntry"/&amp;gt;&lt;BR /&gt;    &amp;lt;/xsl:copy&amp;gt;&lt;BR /&gt;  &amp;lt;/xsl:template&amp;gt;&lt;BR /&gt;  &amp;lt;xsl:template match="DatabaseEntry"&amp;gt;&lt;BR /&gt;    &amp;lt;xsl:if test="position() &amp;gt;= $startPosition and $endPosition &amp;gt;= position()"&amp;gt;&lt;BR /&gt;      &amp;lt;xsl:copy&amp;gt;&lt;BR /&gt;        &amp;lt;xsl:apply-templates select="@* | node()"/&amp;gt;&lt;BR /&gt;      &amp;lt;/xsl:copy&amp;gt;&lt;BR /&gt;    &amp;lt;/xsl:if&amp;gt;&lt;BR /&gt;  &amp;lt;/xsl:template&amp;gt;&lt;BR /&gt;&amp;lt;/xsl:stylesheet&amp;gt;&lt;/PRE&gt;
&lt;BR /&gt;(Note, by the way, that because this is based on the identity transform, it works even if ExampleDatabase isn't the top-level element.)
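&lt;BR /&gt;One possible way to prepare the repeated runs is sketched below in Python, outside Talend. It streams through the file to count the DatabaseEntry elements, then derives the 1-based, inclusive position ranges to pass to the stylesheet. The function names count_entries and position_ranges are hypothetical helpers chosen for illustration, and the sketch assumes DatabaseEntry elements are not nested inside one another.
&lt;PRE&gt;
```python
import xml.etree.ElementTree as ET


def count_entries(source, tag="DatabaseEntry"):
    """Count `tag` elements while streaming, so the file is never fully loaded."""
    count = 0
    # iterparse reports each element as its start and end tags are read.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            count += 1
            root.clear()  # drop the finished entry so memory use stays flat
    return count


def position_ranges(total, chunk_size):
    """Return 1-based, inclusive (start, end) pairs that cover 1..total."""
    return [
        (start, min(start + chunk_size - 1, total))
        for start in range(1, total + 1, chunk_size)
    ]
```
&lt;/PRE&gt;
&lt;BR /&gt;Each (start, end) pair would then supply the startPosition and endPosition values for one run of the transform; for example, 3 entries with a chunk size of 2 give the pairs (1, 2) and (3, 3).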
&lt;BR /&gt;You still need to count the DatabaseEntry elements in the source XML and run the transform repeatedly, each time with the appropriate values for the startPosition and endPosition parameters.</description>
      <pubDate>Tue, 16 Jul 2013 09:58:18 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Split-really-big-xml-file-in-multiple-XML-files/m-p/2316017#M86603</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-07-16T09:58:18Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Split really big xml file in multiple XML files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Split-really-big-xml-file-in-multiple-XML-files/m-p/2316018#M86604</link>
      <description>Thank you for your help, Mbaroudi! 
&lt;BR /&gt;I just found this this morning: 
&lt;A href="http://linux.die.net/man/1/xml_split" rel="nofollow noopener noreferrer"&gt;http://linux.die.net/man/1/xml_split&lt;/A&gt; 
&lt;BR /&gt;This Linux command splits the file into files of the chosen size and keeps the XML structure. 
&lt;BR /&gt;But I think I'm going to try your way, Mbaroudi (so the job will also run correctly on Windows if needed). 
&lt;BR /&gt;Pikerman: sorry, I don't know much about PHP (please create a separate topic about this).</description>
      <pubDate>Tue, 16 Jul 2013 16:10:20 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Split-really-big-xml-file-in-multiple-XML-files/m-p/2316018#M86604</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-07-16T16:10:20Z</dc:date>
    </item>
  </channel>
</rss>

