topic Re: [resolved] Parsing data from HTML in Talend Studio

[resolved] Parsing data from HTML

Anonymous — Tue, 10 Mar 2015 01:50:13 GMT

I have tried several approaches, and to be honest, it's a pain in the ass.
I have tried the tHTMLParse custom component from the Exchange, but it does not help much.
Is there maybe a way to map data from an HTML document with XPath, like in the tFileInputXML component? For example, extracting the value of an attribute and the text of a tag. I'm surprised there isn't anything like that, or am I just missing something?
Also, I saw there are some other components used for this purpose, but nothing for version 5.5.1.

Re: [resolved] Parsing data from HTML

Anonymous — Tue, 10 Mar 2015 08:07:44 GMT

Hi
There is no special component for extracting the value of an attribute or a tag from an HTML file. You can try using tFileInputRegex with a regular expression to do this.
Best regards
Shong
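
To illustrate the regex approach suggested above, here is a minimal plain-Java sketch of the kind of pattern one might try in tFileInputRegex. It is not a Talend job: the class name HtmlRegexSketch, the method extractLinks, and the sample HTML are all made up for illustration. It assumes double-quoted attributes and links whose text contains no nested tags, so real-world HTML will likely need a more forgiving pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlRegexSketch {

    // Group 1: value of the href attribute. Group 2: text between <a ...> and </a>.
    // Assumption: attributes are double-quoted and the link text has no nested tags.
    static final Pattern LINK = Pattern.compile(
            "<a\\s+[^>]*?href=\"([^\"]*)\"[^>]*>([^<]*)</a>",
            Pattern.CASE_INSENSITIVE);

    // Returns one {href, text} pair per link found in the input.
    static List<String[]> extractLinks(String html) {
        List<String[]> rows = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            rows.add(new String[] { m.group(1), m.group(2).trim() });
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not taken from the original question.
        String html = "<ul><li><a class=\"x\" href=\"/a.html\">First</a></li>"
                + "<li><a href=\"/b.html\"> Second </a></li></ul>";
        for (String[] row : extractLinks(html)) {
            System.out.println(row[0] + " | " + row[1]);
        }
    }
}
```

Each capture group yields one value per match (here the attribute value and the tag text), which is roughly the "attribute plus tag value" mapping asked about. A pattern like this is brittle against single-quoted attributes, nested markup, or unusual formatting, so it is worth testing against the actual pages first.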

Re: [resolved] Parsing data from HTML

Anonymous — Tue, 10 Mar 2015 17:44:14 GMT

Thanks. I was hoping someone would have a different answer, but I was aware that this would probably be the case.
Thanks once again.