Anonymous
Not applicable

Getting a specific tag from an HTML file

Hi everyone,

 

I'm trying to get an href link from an HTML file obtained from an HTTP GET request, but it seems that I cannot navigate to the correct XML tags to get the data.

The XPath I'm trying to drill into is: "//*[@id="node-24615"]/div/div/div/div/center/div[3]/div/table/tbody/tr[2]/td[2]/div/a"

The link that I need to get is: "http://obieebr.banrep.gov.co/analytics/saw.dll?Download&Format=excel2007&Extension=.xlsx&BypassCache..."

 

Thanks for your help!

 


 

 

 


Accepted Solutions
Anonymous
Not applicable
Author

HTML is not XML, so parsing it with XML tools will only work in rare cases. A better solution is to use an HTML parser such as jsoup: https://jsoup.org/

 

It will require a bit of Java, but it is entirely possible.
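
To illustrate why XPath over an XML parser is fragile here, the sketch below feeds two snippets to the JDK's standard XML parser. The markup, the `node-1` id, the `example.com` URL, and the `hrefViaXmlParser` helper are all made up for the demonstration; they are not taken from the page in the question. The first snippet happens to be well-formed XML, so the XPath finds the link. The second is ordinary HTML (unquoted attribute, unclosed `<br>`, omitted `</td>` and `</tr>`), and the parser rejects it before any XPath can run.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class HtmlIsNotXml {

    // Hypothetical helper: evaluates the XPath against the markup using the
    // JDK XML parser. Returns null when the markup is not well-formed XML.
    static String hrefViaXmlParser(String markup, String xpath) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(markup)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (SAXException e) {
            // Typical HTML ends up here: it is not well-formed XML.
            return null;
        } catch (ParserConfigurationException | IOException | XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xpath = "//*[@id='node-1']//a/@href";

        // Made-up snippet that is both valid HTML and well-formed XML.
        String wellFormed = "<div id=\"node-1\"><table><tr><td>"
                + "<a href=\"http://example.com/report.xlsx\">Download</a>"
                + "</td></tr></table></div>";

        // Made-up snippet that is ordinary HTML but not well-formed XML.
        String typicalHtml = "<div id=node-1><table><tr><td><br>"
                + "<a href=\"http://example.com/report.xlsx\">Download</a>"
                + "</table></div>";

        System.out.println(hrefViaXmlParser(wellFormed, xpath));  // the href value
        System.out.println(hrefViaXmlParser(typicalHtml, xpath)); // null
    }
}
```

An HTML parser such as jsoup is built to tolerate the second kind of input, which is why it is the safer choice for pages you do not control.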


3 Replies
Anonymous
Not applicable
Author

Thanks for your help. I was finally able to import the jsoup library and write a short piece of Java code to extract the link.

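// Requires imports: java.io.IOException, org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.select.Elements
try {
    // Fetch and parse the page (20-second timeout)
    Document doc = Jsoup.connect(context.webURI).timeout(20000).get();
    // Select the matching elements with a CSS selector
    Elements tds = doc.select(context.elementSelector);
    // Read the attribute named by context.hrefLabel from the first match, if any
    if (!tds.isEmpty()) {
        context.webURIExcel = tds.first().attr(context.hrefLabel);
    }
} catch (IOException e) {
    e.printStackTrace();
}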

 


Anonymous
Not applicable
Author

Nice work!