I've built a job that uses Storm components by following the 'Getting started with a Storm Job' tutorial on the Talend Help website.
I'm using Talend Fabric 6.2.1
When I run the job I get the following error:
java.lang.RuntimeException: java.io.FileNotFoundException: stormtestfromstandard_0_1.jar (The system cannot find the file specified)
at backtype.storm.StormSubmitter.submitJar(StormSubmitter.java:164)
at org.talend.libs.tbd.ee.libstorm.ClusterStormJobRunHelper.submitJob(ClusterStormJobRunHelper.java:66)
at org.talend.libs.tbd.ee.libstorm.StormJobRunHelper.runStorm(StormJobRunHelper.java:96)
at bigdata_project.stormtestfromstandard_0_1.StormTestFromStandard.runJobInTOS(StormTestFromStandard.java:627)
at bigdata_project.stormtestfromstandard_0_1.StormTestFromStandard.main(StormTestFromStandard.java:572)
Caused by: java.io.FileNotFoundException: stormtestfromstandard_0_1.jar (The system cannot find the file specified)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at java.io.FileInputStream.<init>(FileInputStream.java:93)
at backtype.storm.utils.BufferFileInputStream.<init>(BufferFileInputStream.java:31)
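From the trace, backtype.storm.StormSubmitter.submitJar is resolving the jar name against the JVM's current working directory. As a sanity check I put together the small diagnostic below; this is just hypothetical troubleshooting code on my part (the jar name is copied from the error), not anything Talend generates:

import java.io.File;

// Hypothetical diagnostic: print where the JVM actually looks for the job jar.
public class JarCheck {
    public static void main(String[] args) {
        File jar = new File("stormtestfromstandard_0_1.jar"); // name taken from the error above
        System.out.println("Working directory: " + System.getProperty("user.dir"));
        System.out.println("Path being tried:  " + jar.getAbsolutePath());
        System.out.println("Jar exists there:  " + jar.exists());
    }
}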
What I noticed is that when I convert a standard job to a 'Big Data Streaming' job, some components are marked as missing, for example tHDFSConnection and tKafkaConnection, which are critical for configuring Hadoop and Kafka. Under the Run tab there is a 'Storm Configuration' section where you are supposed to enter Name/Value parameters in order to connect to the Hadoop/Storm cluster, but I couldn't find any document or tutorial that explains which parameters are required or how the 'Name' field should be formatted.
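My working assumption, based on the backtype.storm classes in the stack trace, is that the 'Name' column expects standard Apache Storm configuration keys such as nimbus.host, nimbus.thrift.port, storm.zookeeper.servers and storm.zookeeper.port. For illustration, here is the same idea expressed with the Storm Java API (the hostnames are placeholders for my cluster; the ports are Storm's documented defaults):

import backtype.storm.Config;
import java.util.Arrays;

// Sketch only: the config keys I suspect the 'Name' column maps to,
// written with the pre-1.0 backtype.storm API seen in the stack trace.
public class StormConfSketch {
    public static void main(String[] args) {
        Config conf = new Config();
        conf.put(Config.NIMBUS_HOST, "nimbus.mycluster.local");  // "nimbus.host" (placeholder)
        conf.put(Config.NIMBUS_THRIFT_PORT, 6627);               // "nimbus.thrift.port" (default)
        conf.put(Config.STORM_ZOOKEEPER_SERVERS,
                 Arrays.asList("zk1.mycluster.local"));          // "storm.zookeeper.servers" (placeholder)
        conf.put(Config.STORM_ZOOKEEPER_PORT, 2181);             // "storm.zookeeper.port" (default)
        System.out.println(conf);
    }
}

If that assumption is right, entering those plain key strings in the Name field with the matching values should cover the cluster connection, but I'd appreciate confirmation.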
Below is my job (it's very basic, which makes the error all the more frustrating):
The main components I'm using:
- tKafkaInput: to ingest a stream of data
- tJavaStorm (in the Storm job) and tJavaRow (in the Spark job): to split the incoming string into an array (see the sketch after this list)
- tAggregateRow: to count the number of elements after applying a grouping
- tLogRow: to display the result.
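The per-row logic is just a string split; below is a minimal standalone sketch of what the tJavaStorm/tJavaRow body does (the semicolon delimiter and the sample input are placeholders for my actual data):

// Minimal sketch of the row transformation: one delimited line in, an array out.
public class SplitSketch {
    static String[] toArray(String line) {
        return line.split(";"); // placeholder delimiter, e.g. "a;b;c" -> ["a", "b", "c"]
    }

    public static void main(String[] args) {
        for (String field : toArray("a;b;c")) {
            System.out.println(field);
        }
    }
}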
Any help or direction is highly appreciated.
Thanks!