<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Enrich data from the website based on the CSV-file (REST, CSV) in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Enrich-data-from-the-website-based-on-the-CSV-file-REST-CSV/m-p/2291507#M64685</link>
    <description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;Is there an API available to query the information from the site that accepts the postcode as a parameter? If so, try tRest or tHttpRequest to call the API. Also check how many results the API returns per call, and whether it has parameters such as limit and offset, which are usually used to loop through and return all pages. So for now, you need to find out more about the API.&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Shong&lt;/P&gt;</description>
    <pubDate>Tue, 15 Mar 2022 06:34:29 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2022-03-15T06:34:29Z</dc:date>
    <item>
      <title>Enrich data from the website based on the CSV-file (REST, CSV)</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Enrich-data-from-the-website-based-on-the-CSV-file-REST-CSV/m-p/2291506#M64684</link>
      <description>&lt;P&gt;I would like to scrape the website &lt;A href="http://cti.voa.gov.uk/cti/inits.asp" target="_blank"&gt;http://cti.voa.gov.uk/cti/inits.asp&lt;/A&gt; to get the council tax banding for every address, searching by the first 4 characters of the postcode.&lt;/P&gt;&lt;P&gt;I need to create a job that gets data from the website based on data from my CSV file.&lt;/P&gt;&lt;P&gt;The source file looks like this:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0695b00000PKONdAAP.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/139380iED22CB9F533D468C/image-size/large?v=v2&amp;amp;px=999" role="button" title="0695b00000PKONdAAP.png" alt="0695b00000PKONdAAP.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;1) For every postcode (row) in the file (postcode.csv), call the website &lt;A href="http://cti.voa.gov.uk/cti/inits.asp" target="_blank"&gt;http://cti.voa.gov.uk/cti/inits.asp&lt;/A&gt; and enter the first 4 characters of the postcode in the search field. For example, for the postcode "S3 7AY", I should enter "S3 7A" in the search field, as in the picture below:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0695b00000PKOKkAAP.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/141165i126322D360659481/image-size/large?v=v2&amp;amp;px=999" role="button" title="0695b00000PKOKkAAP.png" alt="0695b00000PKOKkAAP.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;2) Then I need to write all information from the search results to a CSV file (one per postcode) with the structure: "&lt;I&gt;Address&lt;/I&gt;", "&lt;I&gt;Council Tax band&lt;/I&gt;", "&lt;I&gt;Improvement indicator&lt;/I&gt;", "&lt;I&gt;Local authority reference number&lt;/I&gt;". But I have no idea how to loop over all the result pages and get the information from each one.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0695b00000PKONnAAP.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/133540iF7A708095752EB7B/image-size/large?v=v2&amp;amp;px=999" role="button" title="0695b00000PKONnAAP.png" alt="0695b00000PKONnAAP.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;3) The file should be named {POSTCODE}_{DATE}.csv, e.g. &lt;I&gt;S37AY_20220314.csv&lt;/I&gt;&lt;/P&gt;&lt;P&gt;4) ZIP all files into one archive.&lt;/P&gt;&lt;P&gt;Could you help me implement this? The most important question is the second step: how to loop over all pages.&lt;/P&gt;&lt;P&gt;I suppose I need to use tFileInputDelimited and then tRestClient with POST, but I don't know how to loop and fetch the results.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2024 23:07:18 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Enrich-data-from-the-website-based-on-the-CSV-file-REST-CSV/m-p/2291506#M64684</guid>
      <dc:creator>SanyaBLR</dc:creator>
      <dc:date>2024-11-15T23:07:18Z</dc:date>
    </item>
    <item>
      <title>Re: Enrich data from the website based on the CSV-file (REST, CSV)</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Enrich-data-from-the-website-based-on-the-CSV-file-REST-CSV/m-p/2291507#M64685</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;Is there an API available to query the information from the site that accepts the postcode as a parameter? If so, try tRest or tHttpRequest to call the API. Also check how many results the API returns per call, and whether it has parameters such as limit and offset, which are usually used to loop through and return all pages. So for now, you need to find out more about the API.&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Shong&lt;/P&gt;</description>
      <pubDate>Tue, 15 Mar 2022 06:34:29 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Enrich-data-from-the-website-based-on-the-CSV-file-REST-CSV/m-p/2291507#M64685</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-03-15T06:34:29Z</dc:date>
    </item>
    <item>
      <title>Re: Enrich data from the website based on the CSV-file (REST, CSV)</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Enrich-data-from-the-website-based-on-the-CSV-file-REST-CSV/m-p/2291508#M64686</link>
      <description>&lt;P&gt;Hi @Shicong Hong&lt;/P&gt;&lt;P&gt;Unfortunately, I can't find an API.&lt;/P&gt;&lt;P&gt;I suppose that all data can be retrieved from the website using a simple POST with the appropriate parameters.&lt;/P&gt;&lt;P&gt;The form expects the first 4 characters of the postcode, and the results are paginated (20 or 50 per page), so the process will need to loop to fetch all addresses.&lt;/P&gt;&lt;P&gt;I inspected the form on the page. Maybe the following request can help, but I'm a bit stuck now.&lt;/P&gt;&lt;P&gt;curl 'http://cti.voa.gov.uk/cti/RefSResp.asp?lcn=0' \&lt;/P&gt;&lt;P&gt;&amp;nbsp;--data-raw 'lstPageSize=50&amp;amp;UARN=&amp;amp;txtDoeCode=0435&amp;amp;txtNameNum=+&amp;amp;txtStreet=&amp;amp;txtPostalDistrict=+&amp;amp;txtPDSpecific=&amp;amp;txtTown=&amp;amp;txtRefSPostCode=MK8+1&amp;amp;txtBillRef=+&amp;amp;txtStartKey=10&amp;amp;txtPageNum=2&amp;amp;txtBack=0&amp;amp;lstBand=+&amp;amp;lstCourtCode=+&amp;amp;lstBandStatus=+&amp;amp;lstPartDomestic=+&amp;amp;txtPageSize=50&amp;amp;txtLastStreetResp=&amp;amp;txtLastPDResp=&amp;amp;txtLastTownResp=&amp;amp;txtStreetSelected=+&amp;amp;lstBA=0435&amp;amp;txtPostCode=MK8+1&amp;amp;txtUpdateDate=04%2F08%2F2021&amp;amp;txtPF=0&amp;amp;txtPickedSubSt=&amp;amp;txtPickedStreet=&amp;amp;txtPickedTown=&amp;amp;blnBAChosen=&amp;amp;intNumFound=1836&amp;amp;intNumStreets=0&amp;amp;blnPaging=1&amp;amp;txtBAName=MILTON+KEYNES&amp;amp;txtBAWeb=http%3A%2F%2Fwww.mkweb.co.uk%2F&amp;amp;txtRedirectTo=InitS.asp' \&lt;/P&gt;&lt;P&gt;&amp;nbsp;--compressed \&lt;/P&gt;&lt;P&gt;&amp;nbsp;--insecure ;&lt;/P&gt;&lt;P&gt;Kind regards,&lt;/P&gt;&lt;P&gt;Sanya&lt;/P&gt;</description>
      <pubDate>Tue, 15 Mar 2022 11:15:03 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Enrich-data-from-the-website-based-on-the-CSV-file-REST-CSV/m-p/2291508#M64686</guid>
      <dc:creator>SanyaBLR</dc:creator>
      <dc:date>2022-03-15T11:15:03Z</dc:date>
    </item>
    <item>
      <title>Re: Enrich data from the website based on the CSV-file (REST, CSV)</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Enrich-data-from-the-website-based-on-the-CSV-file-REST-CSV/m-p/2291509#M64687</link>
      <description>&lt;P&gt;@Sania Oreshkevich, first make sure you can retrieve data from the website using a Talend component: please try tHTTPRequest to send a POST request, or tSystem to execute a curl command. Can you confirm that this step works?&lt;/P&gt;&lt;P&gt;Next, we will see how to loop over each postcode and retrieve the data from all pages.&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Shong&lt;/P&gt;</description>
      <pubDate>Wed, 16 Mar 2022 03:24:29 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Enrich-data-from-the-website-based-on-the-CSV-file-REST-CSV/m-p/2291509#M64687</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-03-16T03:24:29Z</dc:date>
    </item>
  </channel>
</rss>

