Anonymous
Not applicable

How to load the HTML data into tables

Hi Team, 

Can you help me load HTML data into tables using Talend?

I have attached a sample HTML file.

Regards

Jay

 


Accepted Solutions
lojdr
Creator II

Hello Jayrapolu,

 

Generally, well-formed HTML (XHTML) can be processed as XML, so you can use the XML components.

First, the file you attached is not valid HTML: some tags are missing (e.g. <HTML></HTML>) and some are not closed (e.g. <BODY>). You also have not specified the output format or other important conditions, so it is hard to give an exact answer, but here is a general approach.
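
As a quick way to see whether a file will be accepted by XML-based components at all, a well-formedness check can be sketched in plain Java with the standard JAXP parser. This is only an illustration and not part of the attached job; the class name WellFormedCheck and the method isWellFormed are made up for this example, and the sample strings are shortened stand-ins for the real file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether markup parses as well-formed XML.
class WellFormedCheck {

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(
                    markup.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Unclosed or mismatched tags end up here.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not a verdict on the markup.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Every tag is closed, so this parses.
        System.out.println(isWellFormed(
                "<body><table><tr><td>Jay</td></tr></table></body>"));
        // <BODY> is never closed, so the parser rejects it.
        System.out.println(isWellFormed(
                "<BODY><table><tr><td>Jay</td></tr></table>"));
    }
}
```

If the check fails, the file has to be cleaned up (missing closing tags added) before the XML components can read it.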

 

I think the most important component is tXMLMap. See the attached screenshot (sorry for the naming convention). If we take only the relevant part of the HTML you provided:

<body>
<table cellpadding="0" cellspacing="0" border="0" width="100%">
				<tr>
					<td width="186" class="headlabel">CONSUMER:</td>
					<td width="320" class="headvalue">Jay</td>
					<td width="73"><img src="images/spacer.gif" /></td>
					<td width="118" class="headlabel">DATE:</td>
					<td width="128" class="headvalue">17-10-2017</td>
				</tr>
				<tr>
					<td class="headlabel">MEMBER ID:</td>
					<td class="headvalue">AA40238899_C2C1               </td>
					<td><img src="images/spacer.gif" /></td>
					<td class="headlabel">TIME:</td>
					<td class="headvalue">12:32:54</td>
				</tr>
</table>
</body>

You can use the following job to extract headlabels and headvalues.
(Screenshot: 0683p000009LsOc.png)

I also attached an export of the job. 
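
For anyone who cannot open the screenshot or the export, the idea of pairing each headlabel cell with the headvalue cell that follows it can be sketched in plain Java with XPath. This is a rough illustration under my own assumptions, not the attached Talend job: the class HeadPairs and the method extract are hypothetical names, the input must already be well-formed, and the sample string uses single-quoted attributes only to keep the Java literal readable.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: maps each headlabel text to the next headvalue text.
class HeadPairs {

    static Map<String, String> extract(String xhtml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Select every label cell in document order.
            NodeList labels = (NodeList) xpath.evaluate(
                    "//td[@class='headlabel']",
                    new InputSource(new StringReader(xhtml)),
                    XPathConstants.NODESET);

            Map<String, String> pairs = new LinkedHashMap<>();
            for (int i = 0; i < labels.getLength(); i++) {
                Node label = labels.item(i);
                // The first headvalue sibling after the label holds its value.
                String value = xpath.evaluate(
                        "following-sibling::td[@class='headvalue'][1]", label);
                // trim() drops padding such as the spaces after the member ID.
                pairs.put(label.getTextContent().trim(), value.trim());
            }
            return pairs;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse input", e);
        }
    }

    public static void main(String[] args) {
        String sample =
                "<body><table><tr>"
                + "<td class='headlabel'>CONSUMER:</td>"
                + "<td class='headvalue'>Jay</td>"
                + "<td><img src='images/spacer.gif' /></td>"
                + "<td class='headlabel'>DATE:</td>"
                + "<td class='headvalue'>17-10-2017</td>"
                + "</tr></table></body>";
        extract(sample).forEach((k, v) -> System.out.println(k + " " + v));
    }
}
```

Inside Talend, the same two XPath expressions are the kind of thing you would configure in an XML component; the loop here only makes the pairing explicit.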

 

I hope this helps you solve the task.

 

Best regards

lojdr

 


htmlImport.zip


2 Replies
Anonymous
Not applicable
Author

Thanks for the solution. Very much appreciated. 

 

Regards
Jay