Hi, I have to scrape data from an ASPX website. Is this possible with Talend? If so, which components would come in handy in this scenario? Thanks in advance,
You can do this, but there isn't a "one size fits all" solution with Talend. I have written a tutorial on how I achieved this with a Formula 1 site. The tutorial is here:
https://www.rilhia.com/tutorials/using-third-party-java-library-scrape-content-table-web-page
I included the job so you can take that and have a play. But remember it was written specifically for the site I was working with.
Thanks r_hall,
My requirement is this: I have a CSV file that holds addresses, and I want Talend to look up those addresses on the site and copy the relevant data into a new CSV file or load it into the database.
CSV -> Talend -> fetch the website -> look up addresses from CSV on website -> return relevant data and save
Do you think this is possible with components, or do I have to create a convoluted Java routine for it? (I am not fluent in Java.)
Thanks,
I doubt there is a component that will handle this, but I am not aware of every component available on Talend Exchange (it may be worth checking there). However, writing a Java routine that makes use of a third-party API really isn't that hard. If you are new to Talend, it may be a bit more of a challenge, but any gain in Java knowledge can only be a benefit when using Talend. It opens so many extra doors.
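To give a feel for how small such a routine can be, here is a minimal sketch in plain Java of the flow described above: take each address, build a lookup URL, pull one value out of the returned page, and produce CSV rows. Everything site-specific is an assumption made for illustration: the search URL and its "address" parameter, the <span id="result"> markup, and the idea that the site answers a simple GET request (many ASPX pages instead expect a form postback, which would need more work). The class and method names are hypothetical too. The page fetch is passed in as a function, so the sketch runs here with a stub instead of a real HTTP call.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AddressLookupSketch {

    // Hypothetical search URL; the real site's URL and parameter name will differ.
    static final String SEARCH_URL = "https://example.com/Search.aspx?address=";

    // Hypothetical markup: assumes the wanted value sits in <span id="result">...</span>.
    static final Pattern RESULT =
            Pattern.compile("<span id=\"result\">(.*?)</span>", Pattern.DOTALL);

    // Builds the lookup URL for one address, URL-encoding spaces and symbols.
    static String buildUrl(String address) {
        try {
            return SEARCH_URL + URLEncoder.encode(address, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Returns the text inside the result span, or an empty string if it is absent.
    static String extractResult(String html) {
        Matcher m = RESULT.matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }

    // Quotes a value for CSV output, doubling any embedded quotes.
    static String csvField(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Looks up every address via the supplied fetcher (URL in, HTML out)
    // and returns CSV lines: a header followed by one row per address.
    static List<String> lookupAll(List<String> addresses, Function<String, String> fetcher) {
        List<String> out = new ArrayList<>();
        out.add("address,result");
        for (String address : addresses) {
            String html = fetcher.apply(buildUrl(address));
            out.add(csvField(address) + "," + csvField(extractResult(html)));
        }
        return out;
    }

    public static void main(String[] args) {
        // Stub fetcher standing in for a real HTTP request, so nothing hits the network.
        Function<String, String> stub =
                url -> "<html><span id=\"result\">sample data</span></html>";
        for (String line : lookupAll(Arrays.asList("1 Main St", "22 High Road"), stub)) {
            System.out.println(line);
        }
    }
}
```

In a Talend job, static methods like these could sit in a routine and be called once per row, with the usual file components reading the input CSV and writing the output. The regular expression is only a stand-in: for real pages, a third-party HTML parser like the one used in the tutorial is likely to be far more robust.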