Anonymous
Not applicable

tHTMLInput

I would like to parse the table on the following page:
http://english.mnb.hu/arfolyamok
So the HTML is something like:

<td class="firstcell noborder">AUD</td>
<td>Australian Dollar</td>
<td>1</td>
<td>209.43</td>
<td></td>
<td class="firstcell">KRW</td>
<td>South Korean Won</td>
<td>100</td>
<td>24.87</td>
</tr>

I am trying to use the tHTMLInput component, and I currently get the following message:
Execution failed: Code generation failed.


Could you help me?
Thanks
Didier
14 Replies
Anonymous
Not applicable
Author

Hi,
Have you already checked the overview of the custom component tHTMLInput on Talend Exchange?

Best regards
Sabrina
Anonymous
Not applicable
Author

I have written a tutorial that covers this. It is reasonably complicated and is meant to show what you can do with Talend when you combine it with third-party tools and libraries. It comes with an example job that you can probably tailor to your requirements, although doing so requires some Java knowledge.
http://www.rilhia.com/node/39
Anonymous
Not applicable
Author

Yes to Sabrina: I have checked the component overview.
And yes, I have read the tutorial, but it is not yet clear to me.
Anonymous
Not applicable
Author

Hi Dihonore,
Follow the steps below to get the expected result; they do not require Java knowledge. I hope you have seen the original post by the author of the tHTMLInput component here.

Parent Element = "table.MNBDailyRatesUI_Table.mnbtable"
Add the columns and configure them as follows:

first column = "td:eq(0)"
second column = "td:eq(1)"
third column = "td:eq(2)"

This way you can get the expected result. If you want to try it out, refer to this URL:

http://try.jsoup.org/~Tmx2BFhR_XBIJE0WJMFj86MpMEM


I hope this solves your problem.
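
To make the positional selectors more concrete, here is a minimal sketch of the same idea in plain Java. It does not use tHTMLInput or jsoup: it swaps in the XPath engine that ships with the JDK, where `td[1]` plays the role of `td:eq(0)`, `td[2]` the role of `td:eq(1)`, and so on. The class name `RateTableSketch`, the method `extractRows`, and the cut-down well-formed sample table are assumptions made for illustration. The fourth column (the rate itself) is added here because it appears in the posted HTML; it would correspond to `td:eq(3)`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RateTableSketch {

    // Hypothetical sample: a well-formed, cut-down version of the rate table.
    static final String SAMPLE =
        "<table class=\"MNBDailyRatesUI_Table mnbtable\">"
        + "<tr><td class=\"firstcell noborder\">AUD</td><td>Australian Dollar</td>"
        + "<td>1</td><td>209.43</td></tr>"
        + "<tr><td class=\"firstcell\">KRW</td><td>South Korean Won</td>"
        + "<td>100</td><td>24.87</td></tr>"
        + "</table>";

    // Returns one String[] per table row: code, name, unit, rate.
    static List<String[]> extractRows(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Each <tr> is the context in which the column expressions run.
            NodeList rows = (NodeList) xpath.evaluate(
                "//table//tr", doc, XPathConstants.NODESET);

            List<String[]> result = new ArrayList<>();
            for (int i = 0; i < rows.getLength(); i++) {
                Node row = rows.item(i);
                result.add(new String[] {
                    xpath.evaluate("td[1]", row), // like td:eq(0)
                    xpath.evaluate("td[2]", row), // like td:eq(1)
                    xpath.evaluate("td[3]", row), // like td:eq(2)
                    xpath.evaluate("td[4]", row)  // like td:eq(3)
                });
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample markup", e);
        }
    }

    public static void main(String[] args) {
        for (String[] row : extractRows(SAMPLE)) {
            System.out.println(String.join("|", row));
        }
    }
}
```

The XML parser used here only accepts well-formed markup, so it is not a substitute for an HTML parser on the live page; the sketch only shows how a row-level context combined with positional column expressions yields one record per row.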
Anonymous
Not applicable
Author

Here is another example to help me understand how tHTMLInput works, on the same site: http://english.mnb.hu/
I want to extract the official euro rate, and I have the following HTML source code:

<div class="MNBStatsValue roundedBox">
<span>
<span id="ctl00_WebPartManager1_MNBEuroExchangeRate1880065841_ctl00_euroValueLabel">EUR</span></span>&#160;
<span id="ctl00_WebPartManager1_MNBEuroExchangeRate1880065841_ctl00_euroPriceLabel" class="BaseRateData">310.83</span>
</div>
So is the parent element "div.MNBStatsValue roundedBox" or "span.BaseRateData"?
And how do I get the rate (310.83)?
Thanks
Didier
Anonymous
Not applicable
Author

Your parent element is div.MNBStatsValue.roundedBox
If you just need the euro value, keep the parent as above and set the column value as follows:

Euro Rate="span#ctl00_WebPartManager1_MNBEuroExchangeRate1880065841_ctl00_euroPriceLabel"

This will solve your problem.
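
As a rough illustration of why the parent is written `div.MNBStatsValue.roundedBox` (one dot per class name, no space), the sketch below runs the same two lookups with the XPath engine bundled in the JDK instead of tHTMLInput or jsoup. The `class` attribute holds two separate class names, so the class-based lookup tests for each token individually, while the id-based lookup matches the `span#...euroPriceLabel` column selector. The class name `EuroRateSketch` and its helper methods are hypothetical; the sample markup is the fragment posted above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EuroRateSketch {

    static final String PRICE_ID =
        "ctl00_WebPartManager1_MNBEuroExchangeRate1880065841_ctl00_euroPriceLabel";

    // The fragment from the post, which happens to be well-formed XML.
    static final String SAMPLE =
        "<div class=\"MNBStatsValue roundedBox\">"
        + "<span><span id=\"ctl00_WebPartManager1_MNBEuroExchangeRate1880065841"
        + "_ctl00_euroValueLabel\">EUR</span></span>&#160;"
        + "<span id=\"" + PRICE_ID + "\" class=\"BaseRateData\">310.83</span>"
        + "</div>";

    // XPath predicate that is true when @class contains the given class token.
    static String hasClass(String name) {
        return "contains(concat(' ', normalize-space(@class), ' '), ' " + name + " ')";
    }

    // Evaluates an XPath expression and returns the text of the first match.
    static String evaluate(String expression, String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    // Counterpart of the CSS selector span#<id>.
    static String rateById(String xhtml) {
        return evaluate("//span[@id='" + PRICE_ID + "']", xhtml);
    }

    // Counterpart of div.MNBStatsValue.roundedBox > span.BaseRateData.
    static String rateByClass(String xhtml) {
        return evaluate(
            "//div[" + hasClass("MNBStatsValue") + " and " + hasClass("roundedBox") + "]"
            + "/span[" + hasClass("BaseRateData") + "]",
            xhtml);
    }

    public static void main(String[] args) {
        System.out.println(rateById(SAMPLE));
        System.out.println(rateByClass(SAMPLE));
    }
}
```

On this fragment both lookups return the text of the same span. On the full page, a class-based lookup may match more than one element, in which case the id-based form is the more specific of the two.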
Anonymous
Not applicable
Author

Currently I get:
Starting job GetCurrency_BNH_HTML at 11:24 16/07/2015.
connecting to socket on port 3598
connected
310.83|
3  %|
0.6  %|
1.50 %|
disconnected
Job GetCurrency_BNH_HTML ended at 11:24 16/07/2015.
Is there a way to specify the class BaseRateData so that I get only the rate?
Thanks
Didier
Anonymous
Not applicable
Author

Set the parent element to "*" and keep the column setting as is. It will return only the euro rate.

EuroRate="span#ctl00_WebPartManager1_MNBEuroExchangeRate1880065841_ctl00_euroPriceLabel"

The tHTMLInput component will then give the expected result. Otherwise, you can filter the output with a tMap to keep only one row.
Anonymous
Not applicable
Author

Starting job GetCurrency_BNH_HTML at 12:50 16/07/2015.
connecting to socket on port 3413
connected
Exception in component tHTMLInput_1
java.lang.NullPointerException
    at pmi.getcurrency_bnh_html_0_1.GetCurrency_BNH_HTML.tHTMLInput_1Process(GetCurrency_BNH_HTML.java:692)
    at pmi.getcurrency_bnh_html_0_1.GetCurrency_BNH_HTML.runJobInTOS(GetCurrency_BNH_HTML.java:1099)
    at pmi.getcurrency_bnh_html_0_1.GetCurrency_BNH_HTML.main(GetCurrency_BNH_HTML.java:920)
disconnected
Job GetCurrency_BNH_HTML ended at 12:50 16/07/2015.

Any other recommendations?