<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Read local HTML file in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Read-local-HTML-file/m-p/2281841#M56243</link>
    <description>Hi!&lt;BR /&gt;I need to parse and save data from .xls files, but I can't use the standard tFileInputExcel component because these files are actually in HTML format. I opened one of them in Notepad and it contained HTML tags:&lt;BR /&gt;&lt;I&gt;".c155 {border-width: 0.5pt;border-color: #000000;border-style: solid;width:2.086%;background-color: #ffffff;}&lt;BR /&gt;.c156 {border-width: 0.5pt;border-color: #000000;border-style: solid;width:2.045%;background-color: #ffffff;}&lt;BR /&gt;.c157 {border-width: 0.5pt;border-color: #000000;border-style: solid;width:2.139%;background-color: #ffffff;}&lt;BR /&gt;.c158 {margin-top: 0.0pt;margin-bottom: 0.0pt;margin-left: 0.6pt;margin-right: auto;width: 1589.15pt;border-collapse: collapse;}&lt;BR /&gt;&amp;lt;/style&amp;gt;&lt;BR /&gt;&amp;lt;/head&amp;gt;&lt;BR /&gt;&amp;lt;body&amp;gt;&lt;BR /&gt;&amp;lt;table class="c10"&amp;gt;&lt;BR /&gt;&amp;lt;tr class="c0"&amp;gt;&lt;BR /&gt;&amp;lt;td valign="top" class="c1"&amp;gt;&amp;lt;p class="c2"&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;/p&amp;gt;&lt;BR /&gt;&amp;lt;p class="c2"&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;/p&amp;gt;&lt;BR /&gt;&amp;lt;/td&amp;gt;&lt;BR /&gt;&amp;lt;td valign="top" class="c3"&amp;gt;&amp;lt;p class="c4"&amp;gt;&amp;lt;span class="c5"&amp;gt;DATA&amp;lt;/span&amp;gt;&amp;lt;/p&amp;gt;&lt;BR /&gt;&amp;lt;p class="c6"&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;/p&amp;gt;..."&lt;/I&gt;&lt;BR /&gt;I found a tutorial on reading HTML tables from a website (&lt;A href="http://www.rilhia.com/tutorials/using-third-party-java-library-scrape-content-table-web-page" target="_blank" rel="nofollow noopener noreferrer"&gt;http://www.rilhia.com/tutorials/using-third-party-java-library-scrape-content-table-web-page&lt;/A&gt;), but I need to work with a local file.&lt;BR /&gt;Please help me!&lt;BR /&gt;Thanks!</description>
    <pubDate>Thu, 18 Feb 2016 12:55:10 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2016-02-18T12:55:10Z</dc:date>
    <item>
      <title>Read local HTML file</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Read-local-HTML-file/m-p/2281841#M56243</link>
      <description>Hi!&lt;BR /&gt;I need to parse and save data from .xls files, but I can't use the standard tFileInputExcel component because these files are actually in HTML format. I opened one of them in Notepad and it contained HTML tags:&lt;BR /&gt;&lt;I&gt;".c155 {border-width: 0.5pt;border-color: #000000;border-style: solid;width:2.086%;background-color: #ffffff;}&lt;BR /&gt;.c156 {border-width: 0.5pt;border-color: #000000;border-style: solid;width:2.045%;background-color: #ffffff;}&lt;BR /&gt;.c157 {border-width: 0.5pt;border-color: #000000;border-style: solid;width:2.139%;background-color: #ffffff;}&lt;BR /&gt;.c158 {margin-top: 0.0pt;margin-bottom: 0.0pt;margin-left: 0.6pt;margin-right: auto;width: 1589.15pt;border-collapse: collapse;}&lt;BR /&gt;&amp;lt;/style&amp;gt;&lt;BR /&gt;&amp;lt;/head&amp;gt;&lt;BR /&gt;&amp;lt;body&amp;gt;&lt;BR /&gt;&amp;lt;table class="c10"&amp;gt;&lt;BR /&gt;&amp;lt;tr class="c0"&amp;gt;&lt;BR /&gt;&amp;lt;td valign="top" class="c1"&amp;gt;&amp;lt;p class="c2"&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;/p&amp;gt;&lt;BR /&gt;&amp;lt;p class="c2"&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;/p&amp;gt;&lt;BR /&gt;&amp;lt;/td&amp;gt;&lt;BR /&gt;&amp;lt;td valign="top" class="c3"&amp;gt;&amp;lt;p class="c4"&amp;gt;&amp;lt;span class="c5"&amp;gt;DATA&amp;lt;/span&amp;gt;&amp;lt;/p&amp;gt;&lt;BR /&gt;&amp;lt;p class="c6"&amp;gt;&amp;lt;br/&amp;gt;&amp;lt;/p&amp;gt;..."&lt;/I&gt;&lt;BR /&gt;I found a tutorial on reading HTML tables from a website (&lt;A href="http://www.rilhia.com/tutorials/using-third-party-java-library-scrape-content-table-web-page" target="_blank" rel="nofollow noopener noreferrer"&gt;http://www.rilhia.com/tutorials/using-third-party-java-library-scrape-content-table-web-page&lt;/A&gt;), but I need to work with a local file.&lt;BR /&gt;Please help me!&lt;BR /&gt;Thanks!</description>
      <pubDate>Thu, 18 Feb 2016 12:55:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Read-local-HTML-file/m-p/2281841#M56243</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-02-18T12:55:10Z</dc:date>
    </item>
    <item>
      <title>Re: Read local HTML file</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Read-local-HTML-file/m-p/2281842#M56244</link>
      <description>Hi,&lt;BR /&gt;Have you tried creating a File XML entry in the metadata to read your XML files? What should your expected output look like?&lt;BR /&gt;Best regards&lt;BR /&gt;Sabrina</description>
      <pubDate>Mon, 22 Feb 2016 03:46:02 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Read-local-HTML-file/m-p/2281842#M56244</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-02-22T03:46:02Z</dc:date>
    </item>
    <item>
      <title>Re: Read local HTML file</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Read-local-HTML-file/m-p/2281843#M56245</link>
      <description>&lt;P align="LEFT"&gt;Hi!&lt;/P&gt;&lt;BR /&gt;&lt;P align="LEFT"&gt;I solved my problem with the tHTTPTableInput component.&lt;/P&gt;&lt;BR /&gt;&lt;P align="LEFT"&gt;tFileInputFullRow------------&amp;gt;tFileOutputDelimited&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |&lt;/P&gt;&lt;BR /&gt;&lt;P align="LEFT"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; | onSubjobOk&lt;/P&gt;&lt;BR /&gt;&lt;P align="LEFT"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; \/&lt;/P&gt;&lt;BR /&gt;&lt;P align="LEFT"&gt;tHTTPTableInput------------&amp;gt;tLogRow&lt;/P&gt;&lt;BR /&gt;&lt;P align="LEFT"&gt;First, tFileInputFullRow reads my .xls file line by line and tFileOutputDelimited writes it to "D:/tmp/DailyStat.html".&lt;/P&gt;&lt;BR /&gt;&lt;P align="LEFT"&gt;Then, once that subjob finishes (onSubjobOk), tHTTPTableInput reads my HTML table from the URL "file://localhost/D:/tmp/DailyStat.html" with "Syntax for Table : T=1".&lt;BR /&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Wed, 02 Mar 2016 13:03:36 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Read-local-HTML-file/m-p/2281843#M56245</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-03-02T13:03:36Z</dc:date>
    </item>
  </channel>
</rss>

