Anonymous
Not applicable

How to Suppress or Remove XML Declaration Header in payload document

Hello,

I am using a tXMLMap component to build a SOAP payload. When I capture the payload in the tESBConsumer, it contains an XML declaration header:

<?xml version="1.0" encoding="UTF-8"?>

I believe the web service is rejecting the call because of this line. If I send the same payload without the XML declaration from SoapUI, it is processed correctly.

Is there a way to suppress generation of this line? If not, how would I strip it from the flow? A screenshot of my simple job is below:

[Screenshot: Sample Flow with Payload and Error]


Accepted Solutions
Anonymous
Not applicable
Author

Here is my code attempt:

 

public static routines.system.Document removeXMLHeader2(routines.system.Document document) {

    String xml_string = "";

    if (document != null) {
        // Serialize the Talend Document and strip the XML declaration.
        xml_string = document.toString();
        xml_string = xml_string.replaceAll("\\<\\?xml(.+?)\\?\\>", "").trim();
    }

    DocumentBuilder db = null;
    try {
        db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    }

    try {
        // db.parse(...) returns an org.w3c.dom.Document, so this cast is the line that fails at runtime.
        document = (routines.system.Document) db.parse(new ByteArrayInputStream(xml_string.getBytes("UTF-8")));
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }

    return document;
}

Error when running job and calling routine:

 

java.lang.ClassCastException: com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl cannot be cast to routines.system.Document
at routines.removeXMLHeader.removeXMLHeader2(removeXMLHeader.java:101)
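
The cast fails because `DocumentBuilder.parse` returns an `org.w3c.dom.Document` (here the JDK's Xerces `DeferredDocumentImpl`), which is unrelated to Talend's `routines.system.Document` class, so no cast can turn one into the other. If the goal is only to get the XML text without the declaration, one option that stays inside the JDK is to serialize the DOM with `OutputKeys.OMIT_XML_DECLARATION` instead of using a regex. The sketch below shows that idea. The class name `OmitDeclarationDemo`, the method name `toXmlWithoutDeclaration`, and the sample payload are hypothetical, and converting the resulting string back into a Talend Document is not covered here.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

public class OmitDeclarationDemo {

    // Hypothetical helper: parse the XML text and serialize it again without the declaration.
    public static String toXmlWithoutDeclaration(String xml) {
        try {
            // parse(...) yields an org.w3c.dom.Document, not a Talend routines.system.Document.
            org.w3c.dom.Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // Identity transform; OMIT_XML_DECLARATION suppresses the <?xml ...?> line.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(dom), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Surface parsing or serialization problems instead of swallowing them.
            throw new IllegalStateException("Could not re-serialize XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample payload for illustration only.
        String payload = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root><id>1</id></root>";
        System.out.println(toXmlWithoutDeclaration(payload));
    }
}
```

Unlike the original attempt, this rethrows failures rather than printing the stack trace and carrying on, so a malformed payload does not lead to a later `NullPointerException`.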


4 Replies
Anonymous
Not applicable
Author

Yes, XML headers can be a real pain when converting from String to Document in Talend. I struggled with this for ages. In the end I created my own routine to remove the header. This is what I use:

 

public static String removeXMLHeader(String xml) {
    if (xml != null) {
        xml = xml.replaceAll("\\<\\?xml(.+?)\\?\\>", "").trim();
    }

    return xml;
}

I call this method when I pass the XML as a String through a tMap. When I want to convert it back to a Document, I have no issues after using this.
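
To make the effect of the routine above concrete, here is a small standalone sketch. The class name `XmlHeaderDemo` and the sample payload are made up for illustration; only the regex comes from the post. Two caveats: `.` does not match line breaks by default, so a declaration split across lines would not be stripped, and the pattern also removes other processing instructions whose target starts with `xml`, such as `<?xml-stylesheet ...?>`.

```java
public class XmlHeaderDemo {

    // Same regex as the routine above: drops <?xml ... ?> and trims surrounding whitespace.
    public static String removeXMLHeader(String xml) {
        if (xml != null) {
            xml = xml.replaceAll("\\<\\?xml(.+?)\\?\\>", "").trim();
        }
        return xml;
    }

    public static void main(String[] args) {
        // Made-up sample payload for illustration only.
        String payload = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><id>1</id></root>";
        System.out.println(removeXMLHeader(payload)); // prints <root><id>1</id></root>
    }
}
```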

Anonymous
Not applicable
Author

My apologies for needing to be spoon-fed; I'm better at ETL than Java, as will be obvious from my question. I created the routine. I gather I can call document.toString() to get my payload into a String. What is the best way to convert the String back to a Document?

Not sure if I need to remap all the bits or can simply cast it back to Document.
Anonymous
Not applicable
Author

Sorry, I answered this without really digesting your question. The code I gave you is for converting from a String to a Document. I don't think the XML header is your problem: the declaration is a standard part of an XML document. Your issue is with your configuration. According to the error, you are missing a SOAP action. Follow this Talend guide to the tESBConsumer component:

 

https://help.talend.com/reader/KxVIhxtXBBFymmkkWJ~O4Q/1CDi6NINp_q5p0PJbdgSnA