<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Parsing complex XML with lists in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Parsing-complex-XML-with-lists/m-p/2303226#M75133</link>
    <description>What is an efficient way to load the XML below into a database? I want to parse the XML into three tables: item, detail, and tag. I believe my loop XPath query needs to be /database/item.
&lt;BR /&gt;I cannot figure out a way to parse a list without using multiple tFileInputXML components. I have tried tExtractXMLField, but it throws an error: 'Error on line 3 of document : Content is not allowed in prolog. Nested exception: Content is not allowed in prolog.'
&lt;BR /&gt;I can load the file three times, but I assume this would be very inefficient. I have seen similar problems in the forum, but none appear to have solutions.
&lt;BR /&gt;
&lt;BR /&gt;&amp;lt;database&amp;gt;
&lt;BR /&gt; &amp;lt;item id="111" clientName="SB"&amp;gt;
&lt;BR /&gt; &amp;lt;details&amp;gt;
&lt;BR /&gt; &amp;lt;detail child_id="1"&amp;gt;
&lt;BR /&gt; &amp;lt;name&amp;gt;Bis&amp;lt;/name&amp;gt;
&lt;BR /&gt; &amp;lt;amount&amp;gt;2&amp;lt;/amount&amp;gt;
&lt;BR /&gt; &amp;lt;/detail&amp;gt;
&lt;BR /&gt; &amp;lt;detail child_id="2"&amp;gt;
&lt;BR /&gt; &amp;lt;name&amp;gt;Asp&amp;lt;/name&amp;gt;
&lt;BR /&gt; &amp;lt;amount&amp;gt;20&amp;lt;/amount&amp;gt;
&lt;BR /&gt; &amp;lt;/detail&amp;gt;
&lt;BR /&gt; &amp;lt;/details&amp;gt;
&lt;BR /&gt; &amp;lt;tags&amp;gt;
&lt;BR /&gt; &amp;lt;tag tag_id="1"&amp;gt;
&lt;BR /&gt; &amp;lt;name&amp;gt;test&amp;lt;/name&amp;gt;
&lt;BR /&gt; &amp;lt;/tag&amp;gt;
&lt;BR /&gt; &amp;lt;/tags&amp;gt;
&lt;BR /&gt; &amp;lt;/item&amp;gt;
&lt;BR /&gt; &amp;lt;item id="112" clientName="GJ"&amp;gt;
&lt;BR /&gt; &amp;lt;details&amp;gt;
&lt;BR /&gt; &amp;lt;detail child_id="1"&amp;gt;
&lt;BR /&gt; &amp;lt;name&amp;gt;Lib&amp;lt;/name&amp;gt;
&lt;BR /&gt; &amp;lt;amount&amp;gt;1&amp;lt;/amount&amp;gt;
&lt;BR /&gt; &amp;lt;/detail&amp;gt;
&lt;BR /&gt; &amp;lt;/details&amp;gt;
&lt;BR /&gt; &amp;lt;tags&amp;gt;
&lt;BR /&gt; &amp;lt;tag tag_id="1"&amp;gt;
&lt;BR /&gt; &amp;lt;name&amp;gt;test&amp;lt;/name&amp;gt;
&lt;BR /&gt; &amp;lt;/tag&amp;gt;
&lt;BR /&gt; &amp;lt;tag tag_id="2"&amp;gt;
&lt;BR /&gt; &amp;lt;name&amp;gt;anothert&amp;lt;/name&amp;gt;
&lt;BR /&gt; &amp;lt;/tag&amp;gt;
&lt;BR /&gt; &amp;lt;/tags&amp;gt;
&lt;BR /&gt; &amp;lt;/item&amp;gt;
&lt;BR /&gt;&amp;lt;/database&amp;gt;</description>
    <pubDate>Sat, 16 Nov 2024 12:24:52 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2024-11-16T12:24:52Z</dc:date>
    <item>
      <title>Parsing complex XML with lists</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Parsing-complex-XML-with-lists/m-p/2303226#M75133</link>
      <description>What is an efficient way to load the XML below into a database? I want to parse the XML into three tables: item, detail, and tag. I believe my loop XPath query needs to be /database/item.
&lt;BR /&gt;I cannot figure out a way to parse a list without using multiple tFileInputXML components. I have tried tExtractXMLField, but it throws an error: 'Error on line 3 of document : Content is not allowed in prolog. Nested exception: Content is not allowed in prolog.'
&lt;BR /&gt;I can load the file three times, but I assume this would be very inefficient. I have seen similar problems in the forum, but none appear to have solutions.
&lt;BR /&gt;
&lt;BR /&gt;&amp;lt;database&amp;gt;
&lt;BR /&gt; &amp;lt;item id="111" clientName="SB"&amp;gt;
&lt;BR /&gt; &amp;lt;details&amp;gt;
&lt;BR /&gt; &amp;lt;detail child_id="1"&amp;gt;
&lt;BR /&gt; &amp;lt;name&amp;gt;Bis&amp;lt;/name&amp;gt;
&lt;BR /&gt; &amp;lt;amount&amp;gt;2&amp;lt;/amount&amp;gt;
&lt;BR /&gt; &amp;lt;/detail&amp;gt;
&lt;BR /&gt; &amp;lt;detail child_id="2"&amp;gt;
&lt;BR /&gt; &amp;lt;name&amp;gt;Asp&amp;lt;/name&amp;gt;
&lt;BR /&gt; &amp;lt;amount&amp;gt;20&amp;lt;/amount&amp;gt;
&lt;BR /&gt; &amp;lt;/detail&amp;gt;
&lt;BR /&gt; &amp;lt;/details&amp;gt;
&lt;BR /&gt; &amp;lt;tags&amp;gt;
&lt;BR /&gt; &amp;lt;tag tag_id="1"&amp;gt;
&lt;BR /&gt; &amp;lt;name&amp;gt;test&amp;lt;/name&amp;gt;
&lt;BR /&gt; &amp;lt;/tag&amp;gt;
&lt;BR /&gt; &amp;lt;/tags&amp;gt;
&lt;BR /&gt; &amp;lt;/item&amp;gt;
&lt;BR /&gt; &amp;lt;item id="112" clientName="GJ"&amp;gt;
&lt;BR /&gt; &amp;lt;details&amp;gt;
&lt;BR /&gt; &amp;lt;detail child_id="1"&amp;gt;
&lt;BR /&gt; &amp;lt;name&amp;gt;Lib&amp;lt;/name&amp;gt;
&lt;BR /&gt; &amp;lt;amount&amp;gt;1&amp;lt;/amount&amp;gt;
&lt;BR /&gt; &amp;lt;/detail&amp;gt;
&lt;BR /&gt; &amp;lt;/details&amp;gt;
&lt;BR /&gt; &amp;lt;tags&amp;gt;
&lt;BR /&gt; &amp;lt;tag tag_id="1"&amp;gt;
&lt;BR /&gt; &amp;lt;name&amp;gt;test&amp;lt;/name&amp;gt;
&lt;BR /&gt; &amp;lt;/tag&amp;gt;
&lt;BR /&gt; &amp;lt;tag tag_id="2"&amp;gt;
&lt;BR /&gt; &amp;lt;name&amp;gt;anothert&amp;lt;/name&amp;gt;
&lt;BR /&gt; &amp;lt;/tag&amp;gt;
&lt;BR /&gt; &amp;lt;/tags&amp;gt;
&lt;BR /&gt; &amp;lt;/item&amp;gt;
&lt;BR /&gt;&amp;lt;/database&amp;gt;</description>
      <pubDate>Sat, 16 Nov 2024 12:24:52 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Parsing-complex-XML-with-lists/m-p/2303226#M75133</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T12:24:52Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing complex XML with lists</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Parsing-complex-XML-with-lists/m-p/2303227#M75134</link>
      <description>Hi
&lt;BR /&gt;Because you want to parse the XML into three tables, each table needs a different loop XPath query.
&lt;BR /&gt;You can use the tFileInputMSXML component here.
&lt;BR /&gt;The following images may help you.
&lt;BR /&gt;Using three tFileInputXML components with "Multi thread execution" is another workaround.
&lt;BR /&gt;But if the xml file is large, the job will need more memory.
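&lt;BR /&gt;The idea of one loop query per table can be sketched outside Talend. The Python below is only an illustration, not what tFileInputMSXML generates: it rebuilds the sample document from the question in memory, then walks it once and collects rows for the three tables using the relative paths item, details/detail, and tags/tag. The helper names build_sample and split_into_tables, the tuple layout, and the choice to carry the parent item id into the child rows are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def build_sample():
    # Rebuild the sample document from the question programmatically.
    database = ET.Element("database")
    rows = [
        ("111", "SB", [("1", "Bis", "2"), ("2", "Asp", "20")], [("1", "test")]),
        ("112", "GJ", [("1", "Lib", "1")], [("1", "test"), ("2", "anothert")]),
    ]
    for item_id, client, details, tags in rows:
        item = ET.SubElement(database, "item", {"id": item_id, "clientName": client})
        details_el = ET.SubElement(item, "details")
        for child_id, name, amount in details:
            detail = ET.SubElement(details_el, "detail", {"child_id": child_id})
            ET.SubElement(detail, "name").text = name
            ET.SubElement(detail, "amount").text = amount
        tags_el = ET.SubElement(item, "tags")
        for tag_id, name in tags:
            tag = ET.SubElement(tags_el, "tag", {"tag_id": tag_id})
            ET.SubElement(tag, "name").text = name
    return database


def split_into_tables(database):
    # One pass over the parsed tree, with a separate loop path per table.
    items, details, tags = [], [], []
    for item in database.findall("item"):
        item_id = item.get("id")
        items.append((item_id, item.get("clientName")))
        # Assumption: child rows keep the parent item id as a foreign key.
        for detail in item.findall("details/detail"):
            details.append((item_id, detail.get("child_id"),
                            detail.findtext("name"),
                            int(detail.findtext("amount"))))
        for tag in item.findall("tags/tag"):
            tags.append((item_id, tag.get("tag_id"), tag.findtext("name")))
    return items, details, tags


items, details, tags = split_into_tables(build_sample())
```

&lt;BR /&gt;Because child_id and tag_id restart at 1 inside every item, the parent id is kept on each detail and tag row so the three tables can be joined again later.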
&lt;BR /&gt; 
&lt;BR /&gt;Regards,
&lt;BR /&gt;Pedro</description>
      <pubDate>Wed, 01 Feb 2012 02:20:35 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Parsing-complex-XML-with-lists/m-p/2303227#M75134</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2012-02-01T02:20:35Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing complex XML with lists</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Parsing-complex-XML-with-lists/m-p/2303228#M75135</link>
      <description>Thanks, this works. In case others have this problem, I also had to set the generation mode in the advanced options to SAX. The DOM mode didn't retrieve the nested lists.
&lt;BR /&gt;Another method we want to try for our import is to parse the XML, loop through /database/item, retrieve each item's XML, and insert that XML into the database for later use. This doesn't seem to be easy to do. A workaround would be to modify the generated Java code and run that.
&lt;BR /&gt; - How do I make an XPath query return the XML as a string?
&lt;BR /&gt; - If we do need to change the generated Java code, how do I make the change permanent?
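&lt;BR /&gt;On the first question, one thing that may help: in XPath 1.0 the string value of an element is only its concatenated text content, so the markup itself is normally serialized by the host language after the loop query has selected the node. The Python below sketches that idea outside Talend. The in-memory sample with empty details and tags elements and the item_xml dictionary are assumptions for the example, not anything Talend produces.

```python
import xml.etree.ElementTree as ET

# Build a reduced version of the sample: two items with empty child lists.
database = ET.Element("database")
for item_id, client in (("111", "SB"), ("112", "GJ")):
    item = ET.SubElement(database, "item", {"id": item_id, "clientName": client})
    ET.SubElement(item, "details")
    ET.SubElement(item, "tags")

# Loop over /database/item and map each item id to that element's markup.
item_xml = {
    item.get("id"): ET.tostring(item, encoding="unicode")
    for item in database.findall("item")
}
```

&lt;BR /&gt;Each value in item_xml is a standalone fragment that could be stored in a text column and parsed again later.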
&lt;BR /&gt;Thanks again, 
&lt;BR /&gt;Kevin</description>
      <pubDate>Wed, 01 Feb 2012 19:24:46 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Parsing-complex-XML-with-lists/m-p/2303228#M75135</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2012-02-01T19:24:46Z</dc:date>
    </item>
  </channel>
</rss>

