<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Problem with processing huge XML file with tFileInputDelimited in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362805#M126867</link>
    <description>Hi, I have built a very simple job to remove the header (4 lines) and the last line of a very large XML document (more than 100 GB) encoded in ISO-8859-1. 
&lt;BR /&gt;The approach is simple: I use tFileInputDelimited to read the document line by line and drop the 4 header lines. 
&lt;BR /&gt;Then tReplace removes the closing tag (&amp;lt;\IproClassDatabse&amp;gt;); I did not find any other solution for such a big file. 
&lt;BR /&gt;But when the job finishes, the new file (without the header and the last line) has half as many lines as the original (it should have just 5 fewer)! 
&lt;BR /&gt;Using the tail command, I can see that the new XML document does not end the same way as the original. The job seems to have stopped processing the document partway through. 
&lt;BR /&gt;I have tried this job with smaller XML documents and there is no error. 
&lt;BR /&gt; 
&lt;BR /&gt;This is a very simple job, so I really don't see where the problem is. Even though the XML document is very big (120 GB), that should not be a problem; it should just take some time to finish. 
&lt;BR /&gt;Has anyone run into a similar problem, or does anyone have an idea where it comes from? 
&lt;BR /&gt;Screenshot of the job : 
&lt;BR /&gt; 
&lt;A href="http://i.imgur.com/T8tOlkUh.png" target="_blank" rel="nofollow noopener noreferrer"&gt;http://i.imgur.com/T8tOlkUh.png&lt;/A&gt; 
&lt;BR /&gt; 
&lt;A href="http://i.imgur.com/FGc4vu4h.png" target="_blank" rel="nofollow noopener noreferrer"&gt;http://i.imgur.com/FGc4vu4h.png&lt;/A&gt; 
&lt;BR /&gt; 
&lt;A href="http://i.imgur.com/diYxizxh.png" target="_blank" rel="nofollow noopener noreferrer"&gt;http://i.imgur.com/diYxizxh.png&lt;/A&gt;</description>
    <pubDate>Mon, 12 Aug 2013 22:17:10 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2013-08-12T22:17:10Z</dc:date>
    <item>
      <title>Problem with processing huge XML file with tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362805#M126867</link>
      <description>Hi, I have built a very simple job to remove the header (4 lines) and the last line of a very large XML document (more than 100 GB) encoded in ISO-8859-1. 
&lt;BR /&gt;The approach is simple: I use tFileInputDelimited to read the document line by line and drop the 4 header lines. 
&lt;BR /&gt;Then tReplace removes the closing tag (&amp;lt;\IproClassDatabse&amp;gt;); I did not find any other solution for such a big file. 
&lt;BR /&gt;But when the job finishes, the new file (without the header and the last line) has half as many lines as the original (it should have just 5 fewer)! 
&lt;BR /&gt;Using the tail command, I can see that the new XML document does not end the same way as the original. The job seems to have stopped processing the document partway through. 
&lt;BR /&gt;I have tried this job with smaller XML documents and there is no error. 
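&lt;BR /&gt;For reference, the line counts of the original and the output can be checked outside Talend. What follows is only a minimal Python sketch, assuming a line is anything terminated by a newline byte; count_lines and chunk_size are illustrative names and not part of Talend. 

```python
import os
import tempfile


def count_lines(path, chunk_size=1024 * 1024):
    """Count newline bytes in a file, reading fixed-size chunks in binary mode."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # end of file reached
            total += chunk.count(b"\n")
    return total


if __name__ == "__main__":
    # Small self-contained demo on a temporary file (hypothetical content).
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.txt")
        with open(sample, "wb") as f:
            f.write(b"a\nb\nc\n")
        print(count_lines(sample))
```

&lt;BR /&gt;Reading in binary mode keeps the count independent of the ISO-8859-1 encoding, and the fixed chunk size keeps memory use flat regardless of file size. 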
&lt;BR /&gt; 
&lt;BR /&gt;This is a very simple job, so I really don't see where the problem is. Even though the XML document is very big (120 GB), that should not be a problem; it should just take some time to finish. 
&lt;BR /&gt;Has anyone run into a similar problem, or does anyone have an idea where it comes from? 
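&lt;BR /&gt;For comparison, the transformation described above (drop the 4 header lines and the last line) can be sketched outside Talend as a plain streaming copy. This is a minimal Python sketch under stated assumptions, not the Talend job itself: strip_header_and_last_line is a hypothetical helper, lines are assumed to end with a newline byte, and the closing tag is assumed to sit alone on the final line. 

```python
import os
import tempfile


def strip_header_and_last_line(src_path, dst_path, header_lines=4):
    """Copy src to dst, skipping the first header_lines lines and the final line.

    Works on raw bytes, so ISO-8859-1 content is copied unchanged, and holds
    only one line in memory at a time. Returns the number of lines written.
    """
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Skip the header lines.
        for _ in range(header_lines):
            if not src.readline():
                return written  # file is shorter than the header
        # Hold each line back by one so the final line is never written.
        pending = None
        for line in src:
            if pending is not None:
                dst.write(pending)
                written += 1
            pending = line
    return written


if __name__ == "__main__":
    # Small self-contained demo on a temporary file (hypothetical content).
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in.txt")
        dst = os.path.join(tmp, "out.txt")
        with open(src, "wb") as f:
            f.write(b"h1\nh2\nh3\nh4\nrow1\nrow2\nclosing\n")
        print(strip_header_and_last_line(src, dst))
```

&lt;BR /&gt;Because the copy works on bytes rather than decoded text, no character in the file can be misread as an end-of-input marker, which makes the result a useful baseline to compare against the output of the job. 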
&lt;BR /&gt;Screenshot of the job : 
&lt;BR /&gt; 
&lt;A href="http://i.imgur.com/T8tOlkUh.png" target="_blank" rel="nofollow noopener noreferrer"&gt;http://i.imgur.com/T8tOlkUh.png&lt;/A&gt; 
&lt;BR /&gt; 
&lt;A href="http://i.imgur.com/FGc4vu4h.png" target="_blank" rel="nofollow noopener noreferrer"&gt;http://i.imgur.com/FGc4vu4h.png&lt;/A&gt; 
&lt;BR /&gt; 
&lt;A href="http://i.imgur.com/diYxizxh.png" target="_blank" rel="nofollow noopener noreferrer"&gt;http://i.imgur.com/diYxizxh.png&lt;/A&gt;</description>
      <pubDate>Mon, 12 Aug 2013 22:17:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362805#M126867</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-08-12T22:17:10Z</dc:date>
    </item>
    <item>
      <title>Re: Problem with processing huge XML file with tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362806#M126868</link>
      <description>Hi 
&lt;BR /&gt;To read a file line by line, I would suggest using tFileInputFullRow. 
&lt;BR /&gt;Shong</description>
      <pubDate>Tue, 13 Aug 2013 15:48:07 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362806#M126868</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-08-13T15:48:07Z</dc:date>
    </item>
    <item>
      <title>Re: Problem with processing huge XML file with tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362807#M126869</link>
      <description>Hi, thanks for the answer. I have just tried it and the problem is the same: the job stops at the same place.</description>
      <pubDate>Tue, 13 Aug 2013 19:23:03 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362807#M126869</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-08-13T19:23:03Z</dc:date>
    </item>
    <item>
      <title>Re: Problem with processing huge XML file with tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362808#M126870</link>
      <description>Hi 
&lt;BR /&gt;The job is really simple, and I don't see anything wrong in the job settings. Which version are you using? Does the job end normally, without error?
&lt;BR /&gt;Shong</description>
      <pubDate>Wed, 14 Aug 2013 02:53:54 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362808#M126870</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-08-14T02:53:54Z</dc:date>
    </item>
    <item>
      <title>Re: Problem with processing huge XML file with tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362809#M126871</link>
      <description>Hi,&lt;BR /&gt;I am using Talend Open Studio for ESB (5.3.0.r101800).&lt;BR /&gt;And the job ends normally, without error.</description>
      <pubDate>Wed, 14 Aug 2013 15:53:36 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362809#M126871</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-08-14T15:53:36Z</dc:date>
    </item>
    <item>
      <title>Re: Problem with processing huge XML file with tFileInputDelimited</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362810#M126872</link>
      <description>Does anyone know whether the tBoostedFileInputXML component can handle that kind of file too?</description>
      <pubDate>Sat, 13 Dec 2014 19:51:45 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Problem-with-processing-huge-XML-file-with-tFileInputDelimited/m-p/2362810#M126872</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-12-13T19:51:45Z</dc:date>
    </item>
  </channel>
</rss>

