Anonymous
Not applicable

how to find and replace in xml

Hi,
I need to convert some XML files but I cannot figure out how.
The target will be the same XML, except that for one particular element deep in the tree, the content of the element will be replaced by some other string.
<?xml version="1.0" encoding="UTF-8"?>
<ADI>
<Metadata>
<AMS Product="MOD" Asset_Name="xxx" Description="yyy" Provider_ID="zzz" Creation_Date="2012-06-21" Asset_ID="0001" Version_Major="6" Version_Minor="0" />
<App_Data Value="xxx" Name="Metadata_Spec_Version" App="MOD"/>
<App_Data Value="" Name="Provider_Content_Tier" App="MOD"/>
</Metadata>
<Asset>
<Metadata>
<AMS Product="MOD" Asset_Name="xxx" Description="ppp" Provider_ID="01" Creation_Date="2012-06-21" Asset_ID="0001" Version_Major="6" Version_Minor="0" />
<App_Data Value="xxx" Name="Metadata_Spec_Version" App="MOD"/>
<App_Data ... />
<App_Data ... />
<App_Data ... />
...
</Metadata>
<Asset>
<Metadata>
<AMS Product="MOD" Asset_Name="xxx" Description="ppp" Provider_ID="01" Creation_Date="2012-06-21" Asset_ID="0001" Version_Major="6" Version_Minor="0" />
<App_Data ... />
<App_Data ... />
...
</Metadata>
<Content Value="aaa.mpg"/>
</Asset>
<Asset>
<Metadata>
<AMS Product="MOD" Asset_Name="yyy" Description="qqq" Provider_ID="02" Creation_Date="2012-06-21" Asset_ID="0002" Version_Major="6" Version_Minor="0" />
<App_Data ... />
...
</Metadata>
<Content Value="bbb.mpg"/>
</Asset>
<Asset>
<Metadata>
<AMS Product="MOD" Asset_Name="zzz" Description="rrr" Provider_ID="03" Creation_Date="2012-06-21" Asset_ID="0003" Version_Major="6" Version_Minor="0" />
<App_Data ... />
...
</Metadata>
<Content Value="ccc.jpg"/>
</Asset>
</Asset>
</ADI>

The target XML file will be identical, except that "aaa.mpg", "bbb.mpg" and "ccc.jpg" will be replaced by, say, "new_aaa.mpg", "new_bbb.mpg", etc.
Note that in the structure of the XML, Asset can be defined recursively.
I first thought of reading the XML line by line and doing a string find/replace, but that is not a robust solution: the file is XML, and the search string can theoretically appear anywhere in it.
So would the job be something like this?
tFileInputXML --> tXMLMap --> tFileOutputCSV
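To make the intended transformation concrete outside Talend, here is a minimal Python sketch using the standard library's ElementTree. The trimmed sample, the function name `prefix_content_values` and the "new_" prefix rule are assumptions for illustration only. It shows why an XML-aware replace is safer than a text replace: only the Value attribute of Content elements is touched, at any nesting depth, while the same string elsewhere is left alone.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical sample modelled on the XML in the question.
# Note that "aaa.mpg" also appears in a Description attribute.
SAMPLE = """<ADI>
  <Asset>
    <Metadata><AMS Asset_Name="xxx" Description="aaa.mpg"/></Metadata>
    <Asset>
      <Content Value="aaa.mpg"/>
    </Asset>
    <Asset>
      <Content Value="ccc.jpg"/>
    </Asset>
  </Asset>
</ADI>"""


def prefix_content_values(root, prefix="new_"):
    """Prefix the Value attribute of every Content element, at any depth."""
    changed = 0
    # iter("Content") walks the whole tree, so nested Asset levels are covered.
    for content in root.iter("Content"):
        value = content.get("Value")
        if value is not None:
            content.set("Value", prefix + value)
            changed += 1
    return changed


root = ET.fromstring(SAMPLE)
count = prefix_content_values(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

The Description attribute that happens to contain "aaa.mpg" stays unchanged, which is exactly the case a line-by-line string replace would get wrong.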
4 Replies
Anonymous
Not applicable
Author

Hi
If you think a simple string replacement is not robust enough, the best way is to create a job like this:
tFileInputXML --> tReplace --> tAdvancedFileOutputXML
Use tReplace to replace strings in the specified column.
Then use tAdvancedFileOutputXML to recreate the XML file.
Regards,
Pedro
Anonymous
Not applicable
Author

Hi Pedro,
Thank you very much for the answer.
Can I ask a follow-up question to understand things better?
In my proposed solution, the first tFileInputXML extracts the whole root element into a "Document" type column and passes it to tXMLMap. After your answer, I understand that this incurs a second XML parse, which is inefficient. Even so, do you think it is possible to do a find/replace with tXMLMap?
I am trying to understand the use cases for tXMLMap, because my job will need to do something more complex in future: it will need to replace the value of a specific element by calculating it from sibling elements or from attributes of its parent element. Would it be practical to use tXMLMap in those cases?
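As a rough illustration of the "calculate it from sibling elements" case, independent of tXMLMap, the following Python sketch rewrites each Content Value from attributes of the AMS element in the same Asset. The naming rule (Provider_ID, Asset_ID and the old value joined by underscores) and the function name `rename_from_sibling` are hypothetical; the trimmed sample is modelled on the XML above.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical sample modelled on the XML in the question.
SAMPLE = """<ADI>
  <Asset>
    <Metadata><AMS Asset_ID="0001" Provider_ID="01"/></Metadata>
    <Asset>
      <Metadata><AMS Asset_ID="0002" Provider_ID="02"/></Metadata>
      <Content Value="bbb.mpg"/>
    </Asset>
  </Asset>
</ADI>"""


def rename_from_sibling(root):
    """Rewrite Content/@Value using attributes of the sibling Metadata/AMS."""
    # Visit every Asset element, including nested ones.
    for asset in root.iter("Asset"):
        content = asset.find("Content")      # direct child only
        ams = asset.find("Metadata/AMS")     # lives next to Content
        if content is None or ams is None:
            continue
        # Hypothetical rule: <Provider_ID>_<Asset_ID>_<old value>
        parts = [
            ams.get("Provider_ID", ""),
            ams.get("Asset_ID", ""),
            content.get("Value", ""),
        ]
        content.set("Value", "_".join(parts))


root = ET.fromstring(SAMPLE)
rename_from_sibling(root)
values = [c.get("Value") for c in root.iter("Content")]
print(values)
```

ElementTree elements do not keep a pointer to their parent, so the sketch iterates over each Asset and looks down at its children, rather than starting from Content and looking up.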
Anonymous
Not applicable
Author

Hi
You can use tXMLMap.
Just add tReplace between tFileInputXML and tXMLMap.
Regards,
Pedro
Anonymous
Not applicable
Author

Thank you, Pedro, but I think my problem is not about the replace step, as I think I could do that within tXMLMap as well by using an intermediate variable. After spending some time with the components, I think it is more about handling a recursive XML structure in Talend. I will open a new topic on that.
Cheers.
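For readers wondering what "handling a recursive XML structure" involves, here is a small Python sketch, not a Talend job, that flattens nested Asset elements into one row per Asset. The row layout (depth, asset id, parent id, content value) and the function name `flatten_assets` are assumptions chosen for illustration; the sample is a trimmed version of the XML in the question with distinct Asset_ID values.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical sample modelled on the XML in the question.
SAMPLE = """<ADI>
  <Asset>
    <Metadata><AMS Asset_ID="0001"/></Metadata>
    <Asset>
      <Metadata><AMS Asset_ID="0002"/></Metadata>
      <Content Value="bbb.mpg"/>
    </Asset>
    <Asset>
      <Metadata><AMS Asset_ID="0003"/></Metadata>
      <Content Value="ccc.jpg"/>
    </Asset>
  </Asset>
</ADI>"""


def flatten_assets(parent, depth=0, parent_id=None):
    """Yield one (depth, asset_id, parent_id, content_value) row per Asset."""
    for asset in parent.findall("Asset"):    # direct children only
        ams = asset.find("Metadata/AMS")
        asset_id = ams.get("Asset_ID") if ams is not None else None
        content = asset.find("Content")
        value = content.get("Value") if content is not None else None
        yield (depth, asset_id, parent_id, value)
        # Recurse into nested Asset elements, one level deeper.
        yield from flatten_assets(asset, depth + 1, asset_id)


rows = list(flatten_assets(ET.fromstring(SAMPLE)))
for row in rows:
    print(row)
```

Carrying the depth and the parent id in each row keeps the hierarchy recoverable after flattening, which is typically what a row-oriented tool needs in order to rebuild the nested output later.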