Jazz_shomp
Contributor

How to extract the value inside <td> tags from an HTML response

Hello All,

I am designing a job in Talend where I need to extract data from an API.

I want to extract one particular string from the HTML response that I get from the API.

The response is like below:

<html><table><tr><th>name</th></tr><tr><td>abcd</td></tr></table></html>

I want to extract the value of name, 'abcd', from the response.

It's urgent and any help would be greatly appreciated.

Thanks.


Accepted Solutions
gjeremy1617088143

You could also use replaceAll:

(your string).replaceAll(".*\\<td\\>","").replaceAll("\\</td\\>.*","")

 

It first removes everything up to and including <td>,

then everything from </td> onward.

Both this and the regular-expression approach assume the response contains only one <td></td> pair.
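For anyone who wants to try the replaceAll approach outside Talend, here is a minimal standalone sketch. The class name TdReplaceAll and the method extractTd are made up for illustration, and the backslashes before < and > are dropped because Java's regex engine does not require them. It assumes the whole response is on a single line with exactly one <td></td> pair.

```java
class TdReplaceAll {
    // Hypothetical helper: removes everything up to and including <td>,
    // then everything from </td> to the end of the string.
    // If the input has no <td> or </td>, the corresponding step changes nothing.
    static String extractTd(String html) {
        return html.replaceAll(".*<td>", "").replaceAll("</td>.*", "");
    }

    public static void main(String[] args) {
        // Sample response taken from the question.
        String response =
            "<html><table><tr><th>name</th></tr><tr><td>abcd</td></tr></table></html>";
        System.out.println(extractTd(response)); // prints abcd
    }
}
```

Note that . does not match line breaks by default in Java, so a response spread over several lines would probably need the (?s) flag at the start of each pattern.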


3 Replies
gjeremy1617088143

Hi, you can use a regular expression, for example "(?<=d\\>)((.*)(?=\\</td))". It retrieves everything between d> and </td.
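As a rough illustration of the lookaround idea, the sketch below uses java.util.regex directly. It is a variation on the pattern above, not the exact expression: the lookbehind is widened to the full <td> tag and the quantifier is made reluctant (.*?) so that it stops at the first </td>. The class name TdRegex and the method firstTd are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TdRegex {
    // Lookbehind for <td>, lookahead for </td>; group 1 is the cell text.
    private static final Pattern TD = Pattern.compile("(?<=<td>)(.*?)(?=</td>)");

    // Hypothetical helper: returns the text of the first <td> cell,
    // or null when the input contains no <td>...</td> pair.
    static String firstTd(String html) {
        Matcher m = TD.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample response taken from the question.
        String response =
            "<html><table><tr><th>name</th></tr><tr><td>abcd</td></tr></table></html>";
        System.out.println(firstTd(response)); // prints abcd
    }
}
```

Unlike replaceAll, this makes the "no match" case explicit, which may be easier to handle in a job that receives unexpected responses.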


Jazz_shomp
Contributor
Author

Thanks a lot @guenneguez jeremy,

I used (your string).replaceAll(".*\\<td\\>","").replaceAll("\\</td\\>.*","") in tMap and it works.