dapharsyde
Contributor

Xpath text() returns incomplete string if it contains &

I can reproduce an issue when retrieving data from an XML file with an XPath text() query: whenever the string to be returned contains &, the result is incomplete.

Here are the contents of a simple XML file I created:

<root><field id="name">A &amp;amp; B</field></root>

On an xpath testing site such as freeformatter.com, the xpath query of "/root/field[@id='name']/text()" returns the text: 'A &amp;amp; B' as expected.

[Screenshots: XPath query and result on freeformatter.com]

However, in Talend, when I use a tFileInputXML component with /root as the loop element and the same XPath query in the mapping section, the data returned is simply 'A'.

[Screenshots: tFileInputXML configuration and the returned data]

Is this a bug in the tFileInputXML component? Or is there a way to change my XPath query so that it returns the complete string?

4 Replies
Anonymous
Not applicable

Hello @bon shih,

If you set the XPath query to "./field[@id='name']" in the tFileInputXML component, as shown below, it will work as expected.

[Screenshot: tFileInputXML mapping with the XPath query "./field[@id='name']"]
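As a rough illustration of why selecting the element itself works, the sketch below uses Python's standard xml.etree.ElementTree rather than Talend, so it mirrors the idea only and says nothing about the component's internals. It assumes the sample file contains a single escaped ampersand (A &amp; B); SAMPLE_XML and field_value are hypothetical names. The relative path ./field[@id='name'] selects the element, and joining itertext() yields its whole string value.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Assumption: the sample file with a single escaped ampersand.
SAMPLE_XML = '<root><field id="name">A &amp; B</field></root>'


def field_value(xml_text: str, field_id: str) -> Optional[str]:
    """Return the full text of the <field> child with the given id, or None."""
    # The parsed root element plays the role of the /root loop element.
    root = ET.fromstring(xml_text)
    # Select the element itself, not one of its text nodes.
    field = root.find(f"./field[@id='{field_id}']")
    if field is None:
        return None
    # itertext() walks all text inside the element; joining it gives
    # the element's complete string value.
    return "".join(field.itertext())


print(field_value(SAMPLE_XML, "name"))  # A & B
```

Because the whole element is read, the result does not depend on how the text inside it happens to be divided into nodes.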

Best regards

Aiming

 

dapharsyde
Contributor
Author

This is great, and appears to be working. Can you explain the difference between fetching the data with or without the text() function? When I initially looked for Xpath examples to return the text contained in a field, they all used the text() function. It's also puzzling that the Xpath validator accepted it without a problem. Will there be any cases where omitting this will cause an issue?

Anonymous
Not applicable

Without the text() function, the query returns the whole text of the selected element. With the text() function, it may return only one text node of the element, because the content contains the special character &amp;

If the text doesn't contain &, e.g. <root><field id="name">A amp;amp; B</field></root>, then the XPath queries "/root/field[@id='name']/text()" and "/root/field[@id='name']" return the same result.
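A possible reading of "one text of the element", sketched in Python with xml.dom.minidom: some parsers deliver the text around an entity reference in separate chunks, and a tree built from those chunks can hold several adjacent text nodes. Whether tFileInputXML does this internally is an assumption, not something confirmed in this thread. The tree here is built by hand to simulate such a split; build_split_field, first_text_node, and string_value are hypothetical helpers.

```python
from xml.dom.minidom import Document


def build_split_field():
    """Build <field id="name"> holding three adjacent text nodes (simulated split)."""
    doc = Document()
    field = doc.createElement("field")
    field.setAttribute("id", "name")
    # Assumption: the parser emitted the text in three chunks around the entity.
    for chunk in ("A ", "&", " B"):
        field.appendChild(doc.createTextNode(chunk))
    return field


def first_text_node(field):
    """What a 'first text node only' lookup would see."""
    return field.firstChild.data


def string_value(field):
    """Concatenate every text child, i.e. the element's whole text."""
    return "".join(n.data for n in field.childNodes if n.nodeType == n.TEXT_NODE)


field = build_split_field()
print(repr(first_text_node(field)))  # 'A '
print(repr(string_value(field)))     # 'A & B'

# normalize() merges adjacent text nodes into a single node.
field.normalize()
print(len(field.childNodes), repr(first_text_node(field)))  # 1 'A & B'
```

The XPath 1.0 data model requires adjacent text to be grouped into a single text node, which would explain why an online evaluator returns the full string for text() while a tool reading an unmerged tree might not.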

dapharsyde
Contributor
Author

>with the text() function, it may return one text of the element

 

Is this behavior the result of a bug in the Talend component? My understanding is that text() should also return the entire string (regardless of special characters), just as it does on an XPath validator site such as freeformatter.com.