_AnonymousUser
Specialist III

XML file variable structure

Hello,
I have to build a data warehouse and an ODS with XML files as the input source.
I have the XSD file for the XML files.
But I have a problem: the XML files all have different structures. They share the same set of tags, but some files have more tags, some have fewer, and the tags appear in a different order.
I'll get an export file each week containing a set of these XML files.
How can I parse them? Is it even possible, given that the tag order changes from file to file?
For instance:
First xml file:
<student>
<name>Paul</name>
</student>
<college>Mineapolis</college>
Second file:
<student>
<name>Paul</name>
<age>17</age>
</student>
Thank you for your help.
9 Replies
Anonymous
Not applicable

You can do this with tExtractXMLField components (https://help.talend.com/search/all?query=tExtractXMLField&content-lang=en) and a bit of XPath knowledge.
There is absolutely no requirement for all tags to be present; all you have to do is ensure that you have covered every permutation of potential structures (it sounds like the structure will be pretty consistent, just with tags missing from some files). The complicated bit comes with loops and complex structures. As a rule of thumb, you will need one tExtractXMLField component for every loop/complex structure type, and each structure must be passed out of one tExtractXMLField component as a Node to the next tExtractXMLField component. If you do that, you should be able to work your way through this. It won't be easy the first time you do it, but you will learn a lot doing it.
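To illustrate why missing or reordered tags are not a problem for path-based extraction, here is a minimal sketch in Python rather than Talend. It is not the tExtractXMLField component itself. The `<export>` wrapper element, the `FIELDS` list, and the `extract_students` helper are assumptions made for the example, and the sample data places `<college>` inside `<student>` for simplicity.

```python
import xml.etree.ElementTree as ET

# Hypothetical field list: every tag that may appear under <student>.
FIELDS = ["name", "age", "college"]

def extract_students(xml_text):
    """Return one dict per <student>, with None for tags that are absent."""
    root = ET.fromstring(xml_text)
    rows = []
    # Loop over every <student> element in the document (root included).
    for student in root.iter("student"):
        # findtext looks the child up by name, so child order is irrelevant,
        # and it returns None when the child tag is missing.
        rows.append({field: student.findtext(field) for field in FIELDS})
    return rows

# Two sample files with different tag sets and a different tag order.
first = "<export><student><name>Paul</name><college>Mineapolis</college></student></export>"
second = "<export><student><age>17</age><name>Paul</name></student></export>"

rows_first = extract_students(first)
rows_second = extract_students(second)
```

Each lookup is made by tag name, so a file that lacks `<age>` or `<college>` simply yields an empty value for that column instead of failing.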
Anonymous
Not applicable

Thank you for your answer. But I don't understand one point: how can I define the input schema?
Because it will always change...
Anonymous
Not applicable

I see. I hadn't thought of that. But now that I have: you could use a tFileInputRaw to read the data in as a String, then use a tConvertType to convert the String to a Document. Then you will have your XML Document inside the job, and you can use the method described above to get useful data from it.
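The same idea (read the whole file as one string, convert the string to a document, then query it) can be sketched outside Talend. The Python below is only an analogy under stated assumptions: `load_document` and `student_names` are hypothetical helpers, and the sample file with its `<export>` wrapper is made up for the example.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

def load_document(path):
    """Read the file as one raw string, then parse that string into an XML tree."""
    raw = Path(path).read_text(encoding="utf-8")  # no input schema needed here
    return ET.fromstring(raw)

def student_names(document):
    """Query the parsed document; files without <student> yield an empty list."""
    return [s.findtext("name") for s in document.iter("student")]

# Write a small sample file to a temporary directory, then load and query it.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "export.xml"
    sample.write_text(
        "<export><student><name>Paul</name><age>17</age></student></export>",
        encoding="utf-8",
    )
    doc = load_document(sample)
    names = student_names(doc)
```

Because the file is read as an opaque string first, the structure only matters at query time, which is why a changing layout does not need to be declared up front.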
Anonymous
Not applicable

I can't convert the output of the toutputRaw from String to XML. I tried several types. Maybe it's impossible?
Anonymous
Not applicable

This will work.....
[Image: 0683p000009MBNc.png]
Anonymous
Not applicable

Sorry, but I've tried your solution several times, and it doesn't seem to work... I can't see any output in my tLogRow...
Anyway, thank you for your help.
Anonymous
Not applicable

It works, sorry.
Anonymous
Not applicable

Thank you.
Anonymous
Not applicable

No problem.