Anonymous
Not applicable

Using HTML tag on xml file from txmlmap

Hello, I have an XML file with many rows, and in my tXMLMap I need one row that contains HTML.

0683p000009LscJ.png

For this row, I map the html tag in my tXMLMap, but it stops reading at the first line and Talend returns this error:

ORA-01400: Cannot insert NULL into ("DB"."table"."column")

But with another XML file that has many HTML rows, it works. For example, if I press Enter to start a new line after my <p> Hello, </p>, it works.

 

Edit: I tried this:

StringHandling.EREPLACE(row2.html,"</p>","</p><br>") 

but it changed nothing.

 

 

1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

OK. I have a bit of a hack you can use. It is relatively convoluted, but it works. This is how you do it.

 

1) Read the data in as a String using a tFileInputRaw component.

2) The schema of the tFileInputRaw component will be a column called "content" of type String. Connect this to a tConvertType and convert it to a String.

3) Connect a tJavaFlex and use the code below:

row11.content = row12.content.replaceAll("<p>", "").replaceAll("</p>", "").replaceAll("\\<\\?xml(.+?)\\?\\>", "").replaceAll("\\<\\?mso(.+?)\\?\\>", "").replaceAll("\\s{2,}", "").replaceAll("[^\\x20-\\x7e]", "").trim();

This removes all of the rubbish that Talend does not like and creates a reasonably well formatted piece of XML. It also removes all <p> and </p> tags.

4) Connect this output to a tConvertType and convert the String to a Document.

5) Connect to a tExtractXMLField and use the XPaths you used before.

 

This works. I've tried it with your sample file with and without unmatched <p> tags.
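
For anyone who wants to try the cleanup outside of Talend first, here is a minimal standalone sketch of the same replacement chain from step 3. The class name `XmlCleanupSketch`, the method name `clean`, and the sample input are made up for illustration; only the chain of `replaceAll` calls comes from the tJavaFlex line above.

```java
final class XmlCleanupSketch {
    // Same replacements as the tJavaFlex one-liner, applied in the same order.
    static String clean(String raw) {
        return raw
            .replaceAll("<p>", "")                   // drop opening <p> tags
            .replaceAll("</p>", "")                  // drop closing </p> tags
            .replaceAll("\\<\\?xml(.+?)\\?\\>", "")  // drop the <?xml ...?> declaration
            .replaceAll("\\<\\?mso(.+?)\\?\\>", "")  // drop <?mso...?> processing instructions
            .replaceAll("\\s{2,}", "")               // delete runs of two or more whitespace characters
            .replaceAll("[^\\x20-\\x7e]", "")        // delete everything outside printable ASCII (including newlines)
            .trim();
    }

    public static void main(String[] args) {
        // Hypothetical sample with an unmatched <p> tag, not the original poster's file.
        String raw = "<?xml version=\"1.0\"?>\n<root>\n  <html><p>Hello,</p>\n<p>world</html>\n</root>";
        System.out.println(clean(raw)); // prints <root><html>Hello,world</html></root>
    }
}
```

Note one side effect visible in the sample: because line breaks are deleted rather than replaced with a space, text from adjacent lines is joined together ("Hello,world"). If that matters for your data, replacing with a space instead of an empty string in the last two steps may be worth trying.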


42 Replies
Anonymous
Not applicable
Author

I'm afraid your explanation of your problem is not very clear. Are you trying to interrogate HTML using a tXMLMap?

Anonymous
Not applicable
Author

Hello (sorry for my bad English :s)

So I have one XML file, and I need the tag from it, so I put it in my tXMLMap, but my tLogRow shows nothing even though my tag contains one row (the error returned is in the screenshot at the top of the page).

And when I try to insert this data, I get the "NULL" error, but the value in my XML file is not null.

 

Anonymous
Not applicable
Author

Can you post an example of the XML and the element that you want to return? There may be a different component you can use as I suspect the tXMLMap will not be suitable.

Anonymous
Not applicable
Author

This is my XML tag that contains the HTML information (I need all the information in all the <P> tags): 0683p000009LsOU.png

 

 

This is my tLogRow; it shows nothing:

0683p000009LsdH.png

 

These are my tXMLMap columns (main row):

0683p000009LsWW.png

And in my tFileInputXML, in the XPath query for my html, I added //* at the end to read every line. Now it shows the row, BUT only the first <p> tag, not all the <p> tags.

Anonymous
Not applicable
Author

That is because your <html> element contains only another element as far as the tXMLMap is concerned. Therefore it is quite right to return null. Try using the tExtractXMLField to get this HTML tag. You will need to use XPaths for this. You will also need to tick the "Get Nodes" box for the column you want to receive the value in.
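
To illustrate the difference between an element's own text and the element as a node, here is a rough sketch using only the standard JDK DOM API. It is not what Talend generates internally, and the class name, method names, and sample XML are all assumptions made for the example. It shows that an <html> element whose only children are <p> elements has no direct text of its own, while serializing the node (roughly the idea behind "Get Nodes") returns the markup.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NodeVersusTextSketch {
    // Parse the XML string and return the first element with the given tag name.
    static Node firstByTag(String xml, String tag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName(tag).item(0);
    }

    // Concatenate only the text nodes that are direct children of the element.
    static String directText(Node element) {
        StringBuilder sb = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString();
    }

    // Serialize the element together with its child markup.
    static String asMarkup(Node element) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "xml"); // force XML output even though the element is named "html"
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(element), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample row, not the original poster's file.
        Node html = firstByTag("<row><html><p>Hello,</p><p>world</p></html></row>", "html");
        System.out.println("direct text: [" + directText(html) + "]");
        System.out.println("as markup:   " + asMarkup(html));
    }
}
```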

Anonymous
Not applicable
Author

Oh okay, I see, but how should I use the tExtractXMLField? I get "Error on line1 of document, Nested exception".


@rhall wrote:

That is because your <html> element contains only another element as far as the tXMLMap is concerned. Therefore it is quite right to return null. Try using the tExtractXMLField to get this HTML tag. You will need to use XPaths for this. You will also need to tick the "Get Nodes" box for the column you want to receive the value in.


 

Anonymous
Not applicable
Author

You need to be working with a Document. If it is a String, you need to convert it to a Document using a tConvertType component. If you are using HTML and NOT an XML Document, it won't work in many cases. It MUST be XML.
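
A quick way to check whether a String would survive conversion to a Document is to run it through an XML parser first. The sketch below uses the standard JDK parser rather than Talend's own conversion, and the class name, method name, and sample strings are assumptions for illustration. It shows that HTML with an unmatched <p> tag is rejected as not well-formed, which is the kind of input that tends to produce parse errors downstream.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class WellFormedCheckSketch {
    // Returns true if the string parses as well-formed XML, false if the parser rejects it.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser reports malformed input (for example an unclosed tag) as a SAXException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for an in-memory string with the default parser configuration.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical samples: matched tags versus an unmatched <p>.
        System.out.println(isWellFormed("<html><p>Hello,</p><p>world</p></html>"));
        System.out.println(isWellFormed("<html><p>Hello,</p><p>world</html>"));
    }
}
```

The default JDK parser may also print a "[Fatal Error]" line to standard error for the malformed input; that is expected and separate from the boolean result.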

Anonymous
Not applicable
Author

So I'm using a tHttpRequest to get my file and a tFileInputXML to read it.

0683p000009LsG5.png

Should I replace my tFileInputXML with the tExtractXMLField?

Anonymous
Not applicable
Author

The problem is the <p> tag.