I'm currently building a data warehouse and I have to fetch all sales data from the POS into a database. I am able to extract the data and load it into an MSSQL database, but for one of the XML nodes I would like to get the second child node instead of the first. The XML schema with the desired output is as below:
The job is very simple, as shown below.
Notice that <CloseTime> has two child nodes: one for the previous business day and its EOD, and another for the current business day and its EOD.
I need to fetch the second node and combine it with the sales header information.
Currently only the first node is fetched and combined with the sales header information, but it should be the second node. How do I achieve this in tXMLMap? Or do I need to use another component to do that?
This is my first time using Talend, so a solution with step-by-step instructions would be ideal!
Thanks.
You will need to use a different component for this. When reading XML I rarely use a tXMLMap since it is very restrictive. The best component (in my opinion) for reading XML is the tExtractXMLField component, used with XPath queries. XPath can be a pain to learn, but you can test things out online very easily (http://www.xpathtester.com/xpath).
For your problem you want to return the nth instance of a looped element. I believe this is answered here (I haven't tested it, but it looks right):
https://stackoverflow.com/questions/4007413/xpath-query-to-get-nth-instance-of-an-element
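To illustrate the positional-predicate idea from that link outside of Talend, here is a minimal sketch using Python's standard-library ElementTree. The child element names (Day, BizDate, EOD, ReceiptNo) and the sample values are assumptions for illustration only, since the real schema is not reproduced in this thread. In tExtractXMLField the equivalent would presumably be an XPath query along the lines of CloseTime/Day[2], but check it against your actual file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: element names and values are assumed for illustration.
SAMPLE = """
<Header>
  <ReceiptNo>R001</ReceiptNo>
  <CloseTime>
    <Day><BizDate>2019-01-01</BizDate><EOD>2019-01-01T23:00:00</EOD></Day>
    <Day><BizDate>2019-01-02</BizDate><EOD>2019-01-02T23:00:00</EOD></Day>
  </CloseTime>
</Header>
"""

header = ET.fromstring(SAMPLE)

# XPath positions are 1-based, so [2] selects the second <Day> under <CloseTime>.
second_day = header.find("CloseTime/Day[2]")

# [last()] selects the final <Day>; with exactly two children it is the same node.
last_day = header.find("CloseTime/Day[last()]")

current_biz_date = second_day.findtext("BizDate")
current_eod = second_day.findtext("EOD")
print(current_biz_date, current_eod)
```

If the number of child nodes under <CloseTime> can vary, [last()] is probably the safer choice than a fixed index, assuming the current business day is always listed last.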
Hi, is that not possible using tXMLMap?
Eventually I need to extract the XML and split it into different tables in the database.
Basically the tFileInputXML takes every node from the root and the tXMLMap loops accordingly.
With tExtractXMLField I'm not sure whether it can similarly break the XML content into separate tables based on the node.
You could try moving your "loop" in your tXMLMap to the CloseType's Day element, but I suspect you have set it to Header for a reason, and you cannot have two loop elements.
The tExtractXMLField component gives you much more control. You can achieve what you want by using it together with a tMap or a tXMLMap component.
Yes, I need to loop on Header because the file has many receipt nodes, each with one header. I need to retrieve every header node and store it in a table in the DB. The next thing is the line node: there can be multiple line nodes in each receipt node, so I need to loop through the line nodes and store them in another table in the DB. The same goes for the Payment node. So basically it looks like this:
<DailySales>
<Receipt>
<Header>...</Header>
<Line>...</Line>
<Line>...</Line>
<Line>...</Line>
<Payment>...</Payment>
</Receipt>
</DailySales>
I'm able to do it using tExtractXMLField when I loop on Header, which I assume retrieves each and every header node. For the line and payment nodes, do I drop in another tExtractXMLField? Currently it looks like this:
I'm not able to add a second Main output from tFileInputXML_1; the only options left are Row -> Reject/Iterate. My objective is to extract the data from each XML file and split it into 3 DB tables.
Please advise, thanks.
Get the data as granular as you need it using the tExtractXMLField component and then connect it to a tMap component. Use the tMap component to split your data into 3 streams.
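The "flatten first, then split" idea can be sketched outside Talend as follows. This is only an illustration in Python, not a Talend job: the child fields (ReceiptNo, Item, Amount) are assumed, and the `seen` set stands in for whatever de-duplication you would add after the tMap (a tUniqRow on the receipt key would be my guess, untested here).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the <DailySales>/<Receipt> layout in this thread;
# the child fields (ReceiptNo, Item, Amount) are assumed for illustration.
SAMPLE = """
<DailySales>
  <Receipt>
    <Header><ReceiptNo>R001</ReceiptNo></Header>
    <Line><Item>Tea</Item></Line>
    <Line><Item>Cake</Item></Line>
    <Payment><Amount>12.50</Amount></Payment>
  </Receipt>
  <Receipt>
    <Header><ReceiptNo>R002</ReceiptNo></Header>
    <Line><Item>Coffee</Item></Line>
    <Payment><Amount>3.00</Amount></Payment>
  </Receipt>
</DailySales>
"""

root = ET.fromstring(SAMPLE)

# Step 1: flatten to the most granular level (one row per <Line>),
# repeating the header and payment values on every row.
flat_rows = []
for receipt in root.findall("Receipt"):
    receipt_no = receipt.findtext("Header/ReceiptNo")
    amount = receipt.findtext("Payment/Amount")
    for line in receipt.findall("Line"):
        flat_rows.append(
            {"receipt_no": receipt_no, "item": line.findtext("Item"), "amount": amount}
        )

# Step 2: split the flat rows into three streams, emitting the header and
# payment values only the first time each receipt key is seen.
header_rows, line_rows, payment_rows = [], [], []
seen = set()
for row in flat_rows:
    line_rows.append({"receipt_no": row["receipt_no"], "item": row["item"]})
    if row["receipt_no"] not in seen:
        seen.add(row["receipt_no"])
        header_rows.append({"receipt_no": row["receipt_no"]})
        payment_rows.append({"receipt_no": row["receipt_no"], "amount": row["amount"]})

print(len(header_rows), len(line_rows), len(payment_rows))
```

Note the limits of this sketch: a receipt with no Line nodes would disappear at step 1, and a receipt with several Payment nodes would need its own handling.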
If I loop at the <Line> level, the header and payment information will be duplicated on every line row. How do I feed this into a tMap so that it is split accordingly, without duplicating the header and payment information?
With the tExtractXMLField component you can extract a sub-document. Set the column type to be Document and tick the "Get Nodes" tick box. You can pass a separate document for each loop instance to the next component.
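The sub-document approach can be sketched roughly like this in Python (again only an illustration, not Talend code). The first pass loops on <Receipt> and keeps each receipt as its own document, which is what I understand the Document column with "Get Nodes" to do; the second pass extracts header, line, and payment rows from each sub-document separately. The child fields (ReceiptNo, Item, Amount) are assumed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the <DailySales>/<Receipt> layout in this thread;
# the child fields (ReceiptNo, Item, Amount) are assumed for illustration.
SAMPLE = """
<DailySales>
  <Receipt>
    <Header><ReceiptNo>R001</ReceiptNo></Header>
    <Line><Item>Tea</Item></Line>
    <Line><Item>Cake</Item></Line>
    <Payment><Amount>12.50</Amount></Payment>
  </Receipt>
  <Receipt>
    <Header><ReceiptNo>R002</ReceiptNo></Header>
    <Line><Item>Coffee</Item></Line>
    <Payment><Amount>3.00</Amount></Payment>
  </Receipt>
</DailySales>
"""

root = ET.fromstring(SAMPLE)

# First pass: loop on <Receipt> and keep each one as a standalone sub-document.
receipt_docs = [ET.tostring(r, encoding="unicode") for r in root.findall("Receipt")]

# Second pass: parse each sub-document and extract the three row types,
# tagging every row with the receipt key so the tables can be joined later.
header_rows, line_rows, payment_rows = [], [], []
for doc in receipt_docs:
    receipt = ET.fromstring(doc)
    receipt_no = receipt.findtext("Header/ReceiptNo")
    header_rows.append({"receipt_no": receipt_no})
    for line in receipt.findall("Line"):
        line_rows.append({"receipt_no": receipt_no, "item": line.findtext("Item")})
    for payment in receipt.findall("Payment"):
        payment_rows.append({"receipt_no": receipt_no, "amount": payment.findtext("Amount")})

print(len(header_rows), len(line_rows), len(payment_rows))
```

Because each row type is read from the sub-document at its own level, no header or payment values get repeated per line, and receipts without lines still produce a header row.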
I've tried that: for the first tExtractXMLField I used <Header> as the loop, then in the column I passed in <Line> as a Document.
Strangely, the output only gives me the very first <Line> of the first <Receipt>, repeated many times (since I'm looping on Header).
I also tried using <Receipt> (one level up from Header/Line/Payment) as the XPath loop, but the result is the same.
OK, I tried again and this time it succeeded, but may I know what the flow should be?
I thought about tFileList -> tFileInputXML -> tExtractXMLField (header only) -> tExtractXMLField (line only).
Where do I put the tMap here? How do I explicitly send the header data to the SalesHeader table in the DB and the line data to the SalesLine table?
I thought I could have tFileInputXML feed multiple outputs into 3 tExtractXMLField components (header, line, payment), but it seems only one Row -> Main is allowed.