Hi,
I went through various threads in the community and I couldn't find any relevant link or solution.
We have XML files with many different structures. We want to build a generic job that takes any of these XML files as input and returns the id, parentid, depth, name, value and xpath of each node in the input XML.
Is it possible to get these values using Talend?
Hi,
Thanks for the reply.
We have 100+ XML files with different structures, and we want to avoid creating 100+ jobs to handle each one. So we are looking to design a generic job that can process all the files in a loop and give us the shredded XML values.
Also, the XML is stored in an Oracle table in a CLOB column.
We are looking into the feasibility of the above solution using Talend.
In our current application we use Oracle XML functions to shred the XML files dynamically. We are now planning to migrate from the Oracle platform to a Big Data platform, and we plan to use Talend for the data ingestion.
So instead of using the Oracle XML functions again, we are exploring options to implement the same functionality that the Oracle XML functions provide using Talend components.
OK, here is some code to do this. It is a combination of XSLT and Java. I use it for something I was working on which sounds very similar to what you need. I hope it helps....
This is a routine used to enable you to run XSLT against XML in memory and output the result to a String. The "inData" is your XML as a String. The "xslFileData" is your XSLT.
public static String process(String inData, String xslFileData) throws FileNotFoundException {
    String returnString = "";
    try {
        // Create transformer factory
        TransformerFactory factory = TransformerFactory.newInstance();

        // Use the factory to create a template containing the XSL
        Templates template = factory.newTemplates(
                new StreamSource(new StringReader(xslFileData)));

        // Use the template to create a transformer
        Transformer xformer = template.newTransformer();

        // Prepare the input and output
        Source source = new StreamSource(new StringReader(inData));
        StringWriter outWriter = new StringWriter();
        Result result = new StreamResult(outWriter);

        // Apply the XSL to the source XML and write the result to the output writer
        xformer.transform(source, result);
        returnString = outWriter.toString();
    } catch (TransformerConfigurationException e) {
        // An error occurred in the XSL itself
        System.err.println("Invalid XSLT: " + e.getMessageAndLocation());
    } catch (TransformerException e) {
        // An error occurred while applying the XSL.
        // Report the location of the error in the input if one is available (the locator may be null).
        SourceLocator locator = e.getLocator();
        if (locator != null) {
            System.err.println("Transform failed at line " + locator.getLineNumber()
                    + ", column " + locator.getColumnNumber() + ": " + e.getMessage());
        } else {
            System.err.println("Transform failed: " + e.getMessage());
        }
    }
    // Empty string if the transformation failed
    return returnString;
}
You will need the following imports for the above Java....
import java.io.*;
import javax.xml.transform.*;
import javax.xml.transform.stream.*;
The XSLT you can use (you may wish to tweak this) is below (I borrowed this XSLT from here)....
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:variable name="vApos">'</xsl:variable>

  <xsl:template match="*[@* or not(*)]">
    <xsl:if test="not(*)">
      <xsl:apply-templates select="ancestor-or-self::*" mode="path"/>
      <xsl:value-of select="concat('=',$vApos,.,$vApos)"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:if>
    <xsl:apply-templates select="@*|*"/>
  </xsl:template>

  <xsl:template match="*" mode="path">
    <xsl:value-of select="concat('/',name())"/>
    <xsl:variable name="vnumPrecSiblings"
                  select="count(preceding-sibling::*[name()=name(current())])"/>
    <xsl:if test="$vnumPrecSiblings">
      <xsl:value-of select="concat('[', $vnumPrecSiblings +1, ']')"/>
    </xsl:if>
  </xsl:template>

  <xsl:template match="@*">
    <xsl:apply-templates select="../ancestor-or-self::*" mode="path"/>
    <xsl:value-of select="concat('[@',name(), '=',$vApos,.,$vApos,']')"/>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
If you create a job that reads in your XML and the above XSLT as Strings and passes those Strings to a routine with the method I have given you, it will return the majority of what you require.
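To make the "pass both Strings to the routine" step concrete, here is a minimal standalone sketch in plain Java (outside Talend). The class name XsltFlattenDemo, the sample XML and the shortened stylesheet are all made up for illustration. The stylesheet is a simplified stand-in that only handles leaf elements (no attributes, no sibling indexes, no quotes around values), so treat it as a way to test the wiring rather than a replacement for the XSLT quoted above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltFlattenDemo {

    // Hypothetical, simplified stylesheet: one "path=value" line per leaf element.
    public static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:strip-space elements=\"*\"/>"
      + "<xsl:template match=\"*[not(*)]\">"
      + "<xsl:for-each select=\"ancestor-or-self::*\">"
      + "<xsl:value-of select=\"concat('/', name())\"/>"
      + "</xsl:for-each>"
      + "<xsl:value-of select=\"concat('=', .)\"/>"
      + "<xsl:text>&#xA;</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet (a String) against the XML (a String), all in memory.
    public static String process(String inData, String xslFileData)
            throws TransformerException {
        Transformer xformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslFileData)));
        StringWriter out = new StringWriter();
        xformer.transform(new StreamSource(new StringReader(inData)),
                          new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // In a real job this String would come from the CLOB column.
        String xml = "<order><id>42</id><item><name>pen</name></item></order>";
        System.out.print(process(xml, XSLT));
        // prints:
        // /order/id=42
        // /order/item/name=pen
    }
}
```

Unlike the routine above, this version lets the TransformerException propagate instead of returning an empty String, which is only one possible design choice; it makes a bad stylesheet fail loudly while you are testing.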
Hi,
Sorry for delayed response.
Thanks for the code, but I couldn't make it work as I am not a Java guy. I come from an ETL and SQL background.
@rhall_2_0, I have prepared the XSLT for my XML structure, but I could not work out how to use the XSLT in the code given above. Is it possible to provide the full code that needs to be used?
Thanks in advance.
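For the columns the XSLT route does not produce (id, parentid, depth), one possible alternative is a plain DOM walk. The following is only a sketch under stated assumptions: the class name XmlShredder and the row layout are invented here, ids are simply assigned in document order, the root gets parentid 0, every path step carries a 1-based index, and attributes and mixed content are ignored.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XmlShredder {

    // Returns one row per element: {id, parentid, depth, name, value, xpath}.
    public static List<String[]> shred(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String[]> rows = new ArrayList<>();
        walk(doc.getDocumentElement(), 0, 1, "", rows);
        return rows;
    }

    private static void walk(Element el, int parentId, int depth,
                             String parentPath, List<String[]> rows) {
        int id = rows.size() + 1; // assumption: ids follow document order
        String path = parentPath + "/" + el.getNodeName() + "[" + position(el) + "]";

        boolean hasChildElements = false;
        for (Node c = el.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElements = true;
                break;
            }
        }
        // Only leaf elements carry a value in this sketch.
        String value = hasChildElements ? "" : el.getTextContent().trim();

        rows.add(new String[] {
            String.valueOf(id), String.valueOf(parentId), String.valueOf(depth),
            el.getNodeName(), value, path
        });

        // Recurse after adding the row so that parents get smaller ids than children.
        for (Node c = el.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                walk((Element) c, id, depth + 1, path, rows);
            }
        }
    }

    // 1-based position among preceding siblings with the same element name.
    private static int position(Element el) {
        int pos = 1;
        for (Node s = el.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
            if (s.getNodeType() == Node.ELEMENT_NODE
                    && s.getNodeName().equals(el.getNodeName())) {
                pos++;
            }
        }
        return pos;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><id>42</id><item>pen</item><item>ink</item></order>";
        for (String[] row : shred(xml)) {
            System.out.println(String.join(" | ", row));
        }
        // last row: id=4, parentid=1, depth=2, name=item, value=ink,
        //           xpath=/order[1]/item[2]
    }
}
```

Because nothing here depends on the element names, the same method could in principle be called in a loop over all the XML documents, whatever their structure.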