Hello, I have an XML file with many rows, and in my tXMLMap I need one row that contains HTML.
For this row I map the html tag in my tXMLMap, but it stops reading at the first line and Talend returns this error:
ORA-01400: Cannot insert NULL into ("DB"."table"."column")
But with another XML file that has many HTML rows, it works. For example, if I press Enter after my <p> Hello, </p> to start a new line, it works.
Edit: I tried this:
StringHandling.EREPLACE(row2.html,"</p>","</p><br>")
but it changed nothing.
OK. I have a bit of a hack you can use. It is relatively convoluted, but it works. This is how you do it.
1) Read the data in as a String using a tFileInputRaw component.
2) The schema of the tFileInputRaw component will be a column called "content" of type String. Connect this to a tConvertType and convert it to a String.
3) Connect a tJavaFlex and use the code below:
row11.content = row12.content.replaceAll("<p>", "").replaceAll("</p>", "").replaceAll("\\<\\?xml(.+?)\\?\\>", "").replaceAll("\\<\\?mso(.+?)\\?\\>", "").replaceAll("\\s{2,}", "").replaceAll("[^\\x20-\\x7e]", "").trim();
This removes all of the rubbish that Talend does not like and creates a reasonably well formatted piece of XML. It also removes all <p> and </p> tags.
4) Connect this output to a tConvertType and convert the String to a Document.
5) Connect to a tExtractXMLField and use the XPaths you used before.
This works. I've tried it with your sample file with and without unmatched <p> tags.
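For anyone who wants to try the clean-up outside of a job first, steps 3 to 5 can be sketched in plain Java roughly as below. This is only an illustration: the class name, the sample XML and the /row/html XPath are made up for the example, and the JDK's DOM parser and XPath engine stand in for tConvertType and tExtractXMLField.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CleanAndExtract {
    // Same replaceAll chain as in step 3, applied to a plain String.
    static String clean(String raw) {
        return raw.replaceAll("<p>", "")
                  .replaceAll("</p>", "")
                  .replaceAll("\\<\\?xml(.+?)\\?\\>", "")   // XML declaration
                  .replaceAll("\\<\\?mso(.+?)\\?\\>", "")   // mso processing instructions
                  .replaceAll("\\s{2,}", "")                // runs of two or more whitespace characters
                  .replaceAll("[^\\x20-\\x7e]", "")         // anything outside printable ASCII, including newlines
                  .trim();
    }

    // Stand-in for steps 4 and 5: parse the String into a Document, then evaluate an XPath.
    static String extract(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample with an unmatched <p> tag.
        String raw = "<?xml version=\"1.0\"?>\n<row>\n  <html><p>Hello,</p>\n<p>World</html>\n</row>";
        String cleaned = clean(raw);
        System.out.println(cleaned);                        // <row><html>Hello,World</html></row>
        System.out.println(extract(cleaned, "/row/html"));  // Hello,World
    }
}
```

Note that the last two replacements also remove line breaks, so the text of separate paragraphs ends up joined together.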
I'm afraid your explanation of your problem is not very clear. Are you trying to interrogate HTML using a tXMLMap?
Hello (sorry for my bad English)
So I have an XML file, and I need the tag from it, so I put it in my tXMLMap. But my tLogRow shows me nothing, even though my tag contains one row (the error returned is at the top of the page).
And when I try to insert this data, I get the "NULL" error, but there is no null value in my XML file.
Can you post an example of the XML and the element that you want to return? There may be a different component you can use as I suspect the tXMLMap will not be suitable.
This is my XML tag that contains the HTML information (I need all the information in all the <p> tags).
This is my tLogRow, it shows me nothing.
This is my tXMLMap columns (main row).
And in my tFileInputXML, in the XPath query for my html, I added //* at the end to read every line. Now it shows me the row, BUT only the first <p> tag and not all the <p> tags.
That is because your <html> element contains only another element as far as the tXMLMap is concerned. Therefore it is quite right to return null. Try using the tExtractXMLField to get this HTML tag. You will need to use XPaths for this. You will also need to tick the "Get Nodes" box for the column you want to receive the value in.
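To make the difference concrete, here is a small plain-Java sketch using the JDK DOM API. It is not Talend's internal code; it only assumes that a mapper which reads an element's own text would behave like directText below, and that "Get Nodes" behaves roughly like serializing the selected node. The class name and the sample XML are invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeVersusText {
    // Parse the XML and return the first node matched by the XPath.
    static Node select(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse or query the XML", e);
        }
    }

    // Text held directly by the element itself, ignoring child elements.
    static String directText(Node element) {
        StringBuilder text = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.TEXT_NODE) {
                text.append(children.item(i).getNodeValue());
            }
        }
        return text.toString();
    }

    // The whole node, markup included, written back out as a String.
    static String serialize(Node node) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize the node", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<row><html><p>Hello,</p><p>World</p></html></row>";
        Node html = select(xml, "/row/html");
        System.out.println("direct text: [" + directText(html) + "]"); // empty: <html> holds only <p> elements
        System.out.println("as a node:   " + serialize(html));         // keeps every <p> tag
    }
}
```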
Oh okay, I see, but how should I use the tExtractXMLField? I get "Error on line 1 of document, Nested exception".
You need to be working with a Document. If it is a String, you need to convert it to a Document using a tConvertType component. If you are using HTML and NOT an XML Document, it won't work in many cases. It MUST be XML.
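A quick way to see whether a piece of HTML also counts as XML is to run it through an XML parser before it reaches the job. The sketch below uses the JDK parser; the class name and the sample strings are made up for the example. The unclosed <br> case is similar to what replacing "</p>" with "</p><br>" would produce.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Returns true if the text parses as well-formed XML, false if the parser rejects it.
    static boolean isWellFormed(String text) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(text)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<html><p>Hello,</p><p>World</p></html>")); // true
        System.out.println(isWellFormed("<html><p>Hello,</p><br></html>"));         // false: <br> is never closed
        System.out.println(isWellFormed("<html><p>Hello,</p><br/></html>"));        // true: self-closing form is valid XML
    }
}
```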
So I'm using a tHttpRequest to get my file and a tFileInputXML to read it.
Do I replace my tFileInputXML with the tExtractXMLField?
The problem is the <p> tag.