Anonymous
Not applicable

How to use Spark Streaming in Talend

Hello Guys, 
We have been tasked with demonstrating streaming using Talend. Talend Data Fabric 6.3.1 includes Big Data Streaming, which lets you create streaming jobs much like integration jobs, and we chose to create a Spark Streaming job. We tried to follow the simple Spark Streaming job from the Talend documentation, which generates data with tRowGenerator and loads it into an HDFS folder in Avro format. Below are the job and the corresponding Spark configuration we used:
 
0683p000009MDSe.png   0683p000009MDdR.png 
When we ran the job, we received this error:
0683p000009MDZL.png
We also tried running it in Spark local mode, but it produced the same error, and we are not sure where we went wrong. We then tried using Kafka in the job, as shown below: a tKafkaInput reads the words we typed into a Kafka producer and a tLogRow prints them to the console. We still hit the same issue.
We don't have significant experience with Spark Streaming and are not sure whether the errors come from incorrect or missing Spark configuration.
Any help would be greatly appreciated.
Locke  

0683p000009MDaU.png
4 Replies
Anonymous
Not applicable
Author

We are having the same issue. We are using Hortonworks Hadoop 2.5 and Spark 1.6.2.
Anonymous
Not applicable
Author

Hi Talend Team,
We are still having this issue on our end. Even when we use Local Mode in the Spark configuration, our job still cannot load data into HDFS. Do you have a sample job that runs properly with Spark Streaming? We are using Talend Data Fabric 6.3.1.
Any help would be greatly appreciated,
Regards,
Locke    
vapukov
Master II

Only you can tell where you have this setting: 0683p000009MACn.png
0683p000009MCl8.png
As I understand it, you have a space in that path.
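The suggestion above is that a space somewhere in a configured path may be causing trouble. As a minimal sketch of how one might check a path for that, the snippet below lists the components that contain whitespace. The function name `parts_with_whitespace` and both example paths are hypothetical, chosen only for illustration; this is not part of Talend or Spark, and it does not confirm that spaces are the cause of the error in this thread.

```python
from pathlib import PurePath


def parts_with_whitespace(path):
    """Return the components of `path` that contain any whitespace."""
    return [
        part
        for part in PurePath(path).parts
        if any(ch.isspace() for ch in part)
    ]


# Hypothetical install locations, for illustration only.
candidates = [
    "C:/Program Files/Talend Data Fabric/studio",
    "C:/Talend/TalendDataFabric/studio",
]

for candidate in candidates:
    offending = parts_with_whitespace(candidate)
    if offending:
        print(f"{candidate}: whitespace in {offending}")
    else:
        print(f"{candidate}: no whitespace found")
```

Running this against the actual Studio install folder and workspace folder would show which directory names, if any, need renaming.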
Anonymous
Not applicable
Author

Thanks for that, Vapukov. I have now removed the spaces from the name of the folder where my Talend Data Fabric is located. After some issues opening the Studio, I was able to run the job again, but I am still encountering the NullPointerException below:

I am not sure why the tRowGenerator can't generate data (original job below).

Any help would be greatly appreciated.

Regards,

Locke

0683p000009MDTN.png 0683p000009MDdW.png