Jazz_shomp
Contributor

How to extract the value under <td> tags from an HTML response

Hello All,

I am designing a job in Talend where I need to extract data from an API.

I want to extract one particular string from the HTML response that I get from the API.

The response looks like this:

<html><table><tr><th>name</th></tr><tr><td>abcd</td></tr></table></html>

I want to extract the value 'abcd' under the name column from the response.

It's urgent and any help would be greatly appreciated.

Thanks.


Accepted Solutions
gjeremy1617088143

Or you could use replaceAll:

(your string).replaceAll(".*\\<td\\>","").replaceAll("\\</td\\>.*","")

 

This first removes everything up to and including <td>,

then removes </td> and everything after it.

In both scenarios I assume the response contains only one <td></td> pair.
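
The replaceAll approach above could be sketched in plain Java roughly as follows. This is only an illustration under stated assumptions: the class and method names (`TdReplaceDemo`, `extractTd`) are hypothetical, the response is assumed to be a single line, and it is assumed to contain exactly one <td></td> pair.

```java
public class TdReplaceDemo {
    // Hypothetical helper (not part of Talend): removes everything up to and
    // including <td>, then removes </td> and everything after it.
    // Assumes one <td></td> pair and no line breaks in the input,
    // because '.' does not match line terminators by default.
    static String extractTd(String html) {
        return html.replaceAll(".*<td>", "").replaceAll("</td>.*", "");
    }

    public static void main(String[] args) {
        String response =
            "<html><table><tr><th>name</th></tr><tr><td>abcd</td></tr></table></html>";
        System.out.println(extractTd(response)); // prints abcd
    }
}
```

The backslashes before < and > in the original expression are harmless but not required, since neither character is special in a Java regular expression. In a tMap expression, the body of `extractTd` would be applied directly to the input column.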


3 Replies
gjeremy1617088143

Hi, you can use this regular expression, for example: "(?<=d\\>)((.*)(?=\\</td))". It retrieves everything between d> and </td.
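
One way to apply this expression in plain Java is with `java.util.regex.Pattern` and `Matcher`. The sketch below is a hedged illustration, not a Talend-specific recipe: the class and method names (`TdRegexDemo`, `firstTd`) are hypothetical, and it assumes a single-line response with one <td></td> pair.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TdRegexDemo {
    // The expression from the reply: a lookbehind for "d>" and a lookahead
    // for "</td", so only the text between them is part of the match.
    private static final Pattern TD = Pattern.compile("(?<=d\\>)((.*)(?=\\</td))");

    // Hypothetical helper: returns the matched cell text, or null if none is found.
    static String firstTd(String html) {
        Matcher m = TD.matcher(html);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String response =
            "<html><table><tr><th>name</th></tr><tr><td>abcd</td></tr></table></html>";
        System.out.println(firstTd(response)); // prints abcd
    }
}
```

Because `.*` is greedy, a response with several <td> cells would match from the first opening tag to the last closing tag, which is why the single-pair assumption matters here.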


Jazz_shomp
Contributor
Author

Thanks a lot @guenneguez jeremy,

I used (your string).replaceAll(".*\\<td\\>","").replaceAll("\\</td\\>.*","") in tMap and it's working.