Hi,

My use case is as follows: I want to crawl a website, say talend.com, and extract all the information on the website into Hadoop. After that I want to search the data for specific strings, use the results to populate Hive, and create a report. In short, I want to use Talend to pull data from a website and store it in Hadoop.

I watched this video: http://www.talend.com/resources/webinars/watch/215#validatewebinar

Based on it, when I use a tFileFetch or tHttpRequest component and connect to a URI such as "http://talend.com", I only get that single page, which I can save to a file. How can I iterate over the entire contents of the site? I need to discover each distinct URL (talend.com/products etc.) and iteratively fetch all the pages under a master URL.
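To make the question concrete, here is a rough Java sketch of the kind of loop I have in mind (something I imagine could run inside a tJava component or a routine). This is my own guess, not a Talend component: it assumes the jsoup library is available for HTML parsing, and the 500-page cap and same-host filter are arbitrary choices of mine. Is there a Talend-native way to achieve this, or do I have to code it like this?

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashSet;
    import java.util.Set;

    public class SiteCrawler {
        public static void main(String[] args) throws Exception {
            String seed = "http://talend.com";      // master URL (assumption: crawl stays on this host)
            String host = new java.net.URI(seed).getHost();

            Set<String> visited = new HashSet<>();
            Deque<String> queue = new ArrayDeque<>();
            queue.add(seed);

            // Breadth-first walk over links, capped to avoid a runaway crawl
            while (!queue.isEmpty() && visited.size() < 500) {
                String url = queue.poll();
                if (!visited.add(url)) continue;    // skip URLs we have already fetched
                try {
                    Document doc = Jsoup.connect(url).get();
                    System.out.println(url);        // each distinct URL; the page body would be saved to HDFS here
                    for (Element link : doc.select("a[href]")) {
                        String next = link.attr("abs:href");
                        // follow only http(s) links that stay under the master URL's host
                        if (next.startsWith("http")
                                && host.equals(new java.net.URI(next).getHost())
                                && !visited.contains(next)) {
                            queue.add(next);
                        }
                    }
                } catch (Exception e) {
                    System.err.println("Skipping " + url + ": " + e.getMessage());
                }
            }
        }
    }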