Anonymous
Not applicable

Unable to Run a Big Data Job on AWS EMR. Jar not sent to the Remote Talend Server.

Hi All,

 

I am trying to run a Big Data Batch job using Talend Big Data Platform (Enterprise edition).

Our project uses the Amazon EMR Hadoop platform. I was able to set up the Hadoop cluster configuration in the Talend repository, and I can read all the tables in my Hive database. But when I run a job to load data from one Hive table to another, it times out. It seems that the Talend job server never even receives the request. I sat with a member of our AWS team and confirmed that all the ports Talend needs to interact with EMR are open.
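Since the working assumption above is that the ports are open, one quick way to double-check from the Studio machine itself is a plain TCP connect test. The following is a minimal Python sketch, not a Talend utility. The host name in the commented usage is a hypothetical placeholder, and the ports shown (8020, 8032, 10000) are only commonly seen defaults for the HDFS NameNode, YARN ResourceManager and HiveServer2; the values your cluster actually uses may differ.

```python
import socket


def check_ports(host, ports, timeout=3.0):
    """Return {port: True/False}: whether a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host name and attempts a TCP handshake
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # covers refused connections, timeouts and DNS resolution failures
            results[port] = False
    return results


# Example usage (placeholder host and assumed default ports, adjust to your cluster):
# print(check_ports("emr-master.example.internal", [8020, 8032, 10000]))
```

A `False` for a port only shows that this machine could not open a TCP connection to it; it does not say whether a security group, a firewall, or host name resolution is the cause.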

I even tried running a simple Talend Big Data job with a row generator and the HDFS configuration, but nothing creates an entry in the Hadoop Job Server.

 

Here are some screenshots of what I have tried so far. Any help is deeply appreciated.

EMR Configuration: (image: 0683p000009Ls9l.png)

EMR Configuration: (image: 0683p000009Lrwt.png)

Services are successfully running. (image: 0683p000009Ls6x.png)

Able to import the Hive tables and set up the Hadoop cluster in Metadata: (image: 0683p000009Ls4h.png)

Able to create DB Connection: (image: 0683p000009Ls4r.png)

Sample Big Data job: (image: 0683p000009LrdX.png)

It tried to send the job to the remote server, but the job never reaches it. I have waited 30-45 minutes several times, but it doesn't help. There is always a timeout error.

 

 


 

 

2 Replies
Anonymous
Not applicable
Author

Does the service account user for the JobServer have write permission on the folder where it is installed, so that it can copy the binaries into its RemoteJobServerFiles folder?
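One way to check this is to run a small script on the JobServer host as the same service account. The sketch below is a minimal Python example and makes assumptions: the directory path in the commented usage is a hypothetical placeholder for wherever the JobServer is installed, and the check simply tries to create and delete a temporary file there.

```python
import os
import tempfile


def can_write(directory):
    """Return True if the current user can create a file inside `directory`."""
    if not os.path.isdir(directory):
        return False
    try:
        # the temporary file is removed automatically when the block exits
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        # includes PermissionError when the account lacks write access
        return False


# Example usage (placeholder path, replace with the actual install folder):
# print(can_write("/path/to/jobserver/RemoteJobServerFiles"))
```

The result reflects the permissions of whichever user runs the script, so it is only meaningful when run under the JobServer's service account.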

Anonymous
Not applicable
Author

Hi, I am not sure how to locate this folder on the remote server. I can certainly run a Standard job through the remote job server, though.

 

Could you please point me to a typical location, if there is one, so I can verify? Also, do you think this applies only to Big Data jobs, or to Standard jobs as well? If it applies to Standard jobs too, then since those run fine, there could be another issue.

 

Also, please note that I do not see any activity on the Hadoop Job Server, as Talend is not yet deploying the jobs to the Hadoop server.

 

Thanks very much for your help!!