Anonymous
Not applicable

html to xml

Is there any component or way to convert HTML files to XML documents?
I am more interested in the body and title tags. Everything in the <body> of the HTML can stay in the <body> tag of the XML.
Cheers.
10 Replies
Anonymous
Not applicable
Author

Hi
No component converts an HTML file to an XML file directly; you have to extract the records from the HTML file and then insert them into an XML file.
Consider the following job design to extract the desired records from the HTML file:
tFileInputFullRow--main-->tFilterRow-->tExtractRegexFields.
tFileInputFullRow: reads the HTML file one row at a time
tFilterRow: keeps only the desired rows, for example rows that start with <body>
tExtractRegexFields: uses a regular expression to extract fields
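As a rough illustration of what these three steps do, here is a minimal sketch in plain Java. It is not code generated by Talend; the class name `HtmlLineExtractor`, the method `extractTitles`, and the title pattern are all hypothetical, and the sketch assumes the <title> element sits on a single line, as it does in the sample log further down the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HtmlLineExtractor {
    // Hypothetical pattern: captures the text of a <title> element that fits on one line.
    private static final Pattern TITLE =
            Pattern.compile("<title>(.*?)</title>", Pattern.CASE_INSENSITIVE);

    /** Returns the title text from every line that starts with a <title> tag. */
    static List<String> extractTitles(List<String> lines) {
        List<String> titles = new ArrayList<>();
        for (String line : lines) {                        // like tFileInputFullRow: one row per line
            String trimmed = line.trim();
            if (!trimmed.toLowerCase().startsWith("<title>")) {
                continue;                                  // like tFilterRow: drop rows that do not match
            }
            Matcher m = TITLE.matcher(trimmed);            // like tExtractRegexFields: pull out a field
            if (m.find()) {
                titles.add(m.group(1));
            }
        }
        return titles;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "<head>",
                "<title>Partnerships</title>",
                "</head>");
        System.out.println(extractTitles(lines));
    }
}
```

Because each row is handled on its own, this style only works for elements that open and close on the same line.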
Best regards
Shong
Anonymous
Not applicable
Author

Thanks shong for your reply. I tried it. However, I get this "advanced condition failed" error from tFileInputFullRow.
Log output:
connecting to socket on port 3654
connected
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">|advanced condition failed
<head>|advanced condition failed
<meta http-equiv="content-type" content="text/html;" />|advanced condition failed
<title>Partnerships</title>|advanced condition failed
</head>|advanced condition failed
<body><h1 class="entry-title" style="margin-bottom:25px;">Partnerships</h1>|advanced condition failed
.....
....
Anonymous
Not applicable
Author

Hi
I tested reading an HTML file with tFileInputFullRow and had no problem. Can you please send me an example file for testing?
Best regards
Shong
Anonymous
Not applicable
Author

Thanks. I restarted it and it works fine.
However, when I use input_row.htmlstring.startsWith("<body>") in the tFilterRow component, I see only the first line. It seems to break when there is a new line inside the body tag. How can I solve this?
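The filter passes only one row because each line of the file arrives as a separate row, and only the first line of the body starts with <body>. One possible way around this, sketched here in plain Java rather than as a Talend job, is to treat the whole file content as a single string and match the body with a pattern in DOTALL mode, so that `.` also matches line breaks. The class name `HtmlBodyExtractor` and the method `extractBody` are hypothetical, and regular expressions are a fragile way to parse HTML in general; this assumes a single, well-formed <body> element.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HtmlBodyExtractor {
    // DOTALL lets '.' match line terminators, so the body may span many lines.
    private static final Pattern BODY = Pattern.compile(
            "<body[^>]*>(.*?)</body>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns everything between <body ...> and </body>, or null if no body is found. */
    static String extractBody(String html) {
        Matcher m = BODY.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample input modelled on the log output above, held as one multi-line string.
        String html = "<html>\n<head>\n<title>Partnerships</title>\n</head>\n"
                + "<body><h1>Partnerships</h1>\n<p>More text</p>\n</body>\n</html>";
        System.out.println(extractBody(html));
    }
}
```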
Anonymous
Not applicable
Author

Is there any component or way to convert XML, CSV, or database data to HTML files?
Regards
Kishore
_AnonymousUser
Specialist III

If you want to convert an XML file to an HTML one, you just need to use an XSL transformation.
You can use this model as a transformation job:
tFileOutputXML -----> tFileList (in case you want to do this for a group of files) ----> tXSLT
This works; I have already tried it.
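For readers who want to see what an XSL transformation does outside of a Talend job, here is a minimal sketch using the standard `javax.xml.transform` API that ships with the JDK. It does not show how tXSLT is implemented; the class name `XmlToHtml`, the sample XML, and the stylesheet are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlToHtml {
    /** Applies the given XSL stylesheet to the given XML and returns the result as a string. */
    static String transform(String xml, String xsl) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                    factory.newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers of this sketch do not need to handle a checked exception.
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input document and stylesheet.
        String xml = "<page><title>Partnerships</title></page>";
        String xsl =
                "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
              + "<xsl:output method=\"html\"/>"
              + "<xsl:template match=\"/page\">"
              + "<html><body><h1><xsl:value-of select=\"title\"/></h1></body></html>"
              + "</xsl:template>"
              + "</xsl:stylesheet>";
        System.out.println(transform(xml, xsl));
    }
}
```

In a real job the XML and the stylesheet would come from files, for which `StreamSource` also accepts a `java.io.File`.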
Anonymous
Not applicable
Author

Another way to go from XML -> HTML, CSV, and PDF is to use a Jasper Report. This video shows how to use the Jasper Report IDE "iReport" to build a report off of an XML document. The iReport product can be called from a Talend component.
http://youtu.be/Y_JMUv7GiK8
Anonymous
Not applicable
Author

Thank you, friends, it's working.
Regards
Kishore
Anonymous
Not applicable
Author

Hi friends,
A new job: please tell me how to extract data from HTML files in version 4.2.
Regards,
Kishore