Anonymous
Not applicable

Getting a specific tag from an HTML file

Hi everyone,

 

I'm trying to get an href link from an HTML file obtained from an HTTP GET request, but it seems that I cannot navigate to the correct XML tags to get the data.

The XPath I'm trying to follow is: "//*[@id="node-24615"]/div/div/div/div/center/div[3]/div/table/tbody/tr[2]/td[2]/div/a"

and the link that I have to get is: "http://obieebr.banrep.gov.co/analytics/saw.dll?Download&Format=excel2007&Extension=.xlsx&BypassCache..."

 

Thanks for your help!

 

1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

HTML is not XML, so an XML parser with XPath will only work in the rare cases where the page happens to be well-formed. A better solution is to use an HTML parser such as jsoup: https://jsoup.org/

It will require a bit of Java, but it is entirely possible.
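
To illustrate why the XML route is fragile, here is a minimal sketch that uses only the JDK's built-in XML parser and XPath support. The class name, the method name hrefViaXmlParser, and the two sample snippets are made up for this illustration; they are not the original poster's page or job. It assumes the XPath ends in an attribute step such as @href.

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class HtmlAsXmlDemo {

    /**
     * Hypothetical helper: parses the markup with the strict JDK XML parser
     * and evaluates the XPath as a string. Returns empty if the markup is
     * not well-formed XML or if nothing matches.
     */
    public static Optional<String> hrefViaXmlParser(String markup, String xpath) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler keeps the parser from logging to stderr; fatal errors are still thrown.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(markup)));
            String href = XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
            return href.isEmpty() ? Optional.empty() : Optional.of(href);
        } catch (Exception e) {
            // Not well-formed XML (or an invalid XPath): nothing can be extracted this way.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String xpath = "//td/div/a/@href";

        // Well-formed markup: every tag is closed and the attribute value is quoted.
        String wellFormed =
            "<table><tr><td><div><a href=\"report.xlsx\">Download</a></div></td></tr></table>";

        // Valid HTML, but not XML: unquoted attribute value and an unclosed <br>.
        String typicalHtml =
            "<table><tr><td><div><a href=report.xlsx>Download</a><br></div></td></tr></table>";

        System.out.println(hrefViaXmlParser(wellFormed, xpath));  // Optional[report.xlsx]
        System.out.println(hrefViaXmlParser(typicalHtml, xpath)); // Optional.empty
    }
}
```

The second snippet is markup a browser renders without complaint, yet the XML parser rejects it outright. A lenient HTML parser such as jsoup is built to cope with that kind of input.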


3 Replies
Anonymous
Not applicable
Author

Thanks for your help. In the end I was able to import the jsoup library and write a short piece of Java code to extract the link:

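// Requires: java.io.IOException, org.jsoup.Jsoup,
// org.jsoup.nodes.Document, org.jsoup.select.Elements
try {
    // Fetch and parse the page, waiting up to 20 seconds
    Document doc = Jsoup.connect(context.webURI).timeout(20000).get();
    // Find the elements matching the CSS selector held in the context variable
    Elements tds = doc.select(context.elementSelector);
    // Read the wanted attribute from the first match
    // (first() returns null if nothing matched)
    context.webURIExcel = tds.first().attr(context.hrefLabel);
} catch (IOException e) {
    e.printStackTrace();
}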

 


[Attached screenshot: talend solution.PNG]
Anonymous
Not applicable
Author

Nice work!