_AnonymousUser
Specialist III

How do we retrieve data from an HTML page?

Hello Team,
We have an HTML page that is basically a form with a Submit button. I would like to know which component we can use in Talend to get the data entered in the form when the "Submit" button is clicked.
Please explain in detail how to configure it.
Awaiting your response.
Thank you
Regards,
Pratik
2 Replies
Anonymous
Not applicable

Hi Pratik
There are two topics related to parsing HTML:
https://community.talend.com/t5/Archive/resolved-parse-html-to-extract-some-tags/td-p/175918
https://community.talend.com/t5/Archive/resolved-parse-html-to-extract-some-tags-information/td-p/17...
Here is my workaround.
You can download the custom component tTikaExtractor from Exchange
and create a job as follows.
Alternatively, use tExtractRegexFields to extract the data.
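To illustrate the regex approach, here is a minimal plain-Java sketch of the kind of pattern one might use: a regular expression with capture groups, where each group would correspond to one output column. The class name FormFieldExtractor, the sample HTML, and the pattern itself are assumptions for illustration only. The pattern assumes double-quoted attributes with name appearing before value, so it will not cope with arbitrary HTML.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend: pulls name/value pairs
// out of <input> tags with a capture-group regex.
public class FormFieldExtractor {

    // Assumption: attributes are double-quoted and "name" precedes "value".
    // Group 1 = field name, group 2 = field value.
    private static final Pattern INPUT = Pattern.compile(
        "<input[^>]*\\bname=\"([^\"]*)\"[^>]*\\bvalue=\"([^\"]*)\"",
        Pattern.CASE_INSENSITIVE);

    // Returns one {name, value} row per matching <input> tag,
    // in the order the tags appear in the HTML.
    public static List<String[]> extract(String html) {
        List<String[]> rows = new ArrayList<>();
        Matcher m = INPUT.matcher(html);
        while (m.find()) {
            rows.add(new String[] { m.group(1), m.group(2) });
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample form for demonstration.
        String html = "<form>"
            + "<input type=\"text\" name=\"city\" value=\"Pune\">"
            + "<input type=\"text\" name=\"zip\" value=\"411001\">"
            + "</form>";
        for (String[] row : extract(html)) {
            System.out.println(row[0] + "=" + row[1]);
        }
    }
}
```

The same regular expression, with the Java string escaping adjusted as the component requires, could serve as a starting point for the regex setting of tExtractRegexFields. Note that a static HTML file only contains values present in the markup, not values a user types into the browser.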
Regards,
Pedro
_AnonymousUser
Specialist III
Author

Hi Pedro,
I used your model with tTikaExtractor, and I really thank you for it. But I have a problem:
in the output file the useful lines are not in sequence; they are in the same positions as in the HTML file.
Do you know why?
Thanks in advance.