Hi community,
I'm trying to configure a Hadoop connection in Talend Big Data Studio 7.4 using the Cloudera distribution in YARN mode.
On my Hadoop cluster everything is running fine, and all required ports are open:
My configuration files are set correctly, since I am able to verify that everything is running well:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ec2-15-237-143-17.eu-west-3.compute.amazonaws.com:9000</value>
</property>
</configuration>
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>ec2-15-237-143-17.eu-west-3.compute.amazonaws.com</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>768</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>768</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>64</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
</configuration>
<configuration>
<property>
<name>mapreduce.jobtracker.address</name>
<value>ec2-15-237-143-17.eu-west-3.compute.amazonaws.com:54311</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>256</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>128</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>128</value>
</property>
</configuration>
Then, when I try to check my services with the Talend connection wizard, I can only see the NodeManager status, but not the Resource Manager:
I tried several ports for the Resource Manager configuration in Talend (8088, 8032, 8042), but none of them works: the service never shows up in the Talend service checker.
Could you please advise me, or at least give me some hints to take my analysis further?
Thank you very much in advance.
Setting up the connection to a given Hadoop distribution in the Repository saves you from configuring that connection each time you need to use the same Hadoop distribution.
Before you begin
Ensure that the client machine on which the Talend Studio is installed can recognize the host names of the nodes of the Hadoop cluster to be used. For this purpose, add the IP address/hostname mapping entries for the services of that Hadoop cluster in the hosts file of the client machine.
For example, if the host name of the Hadoop Namenode server is talend-cdh550.weave.local, and its IP address is 192.168.x.x, the mapping entry reads 192.168.x.x talend-cdh550.weave.local.
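As a quick way to confirm this prerequisite from the client machine, the following minimal Python sketch checks that a host name resolves and that a TCP port on it accepts connections. It is only an illustration, not part of Talend Studio: the helper names (`resolve`, `port_open`, `check_services`) are hypothetical, and any ports passed in are assumptions to be replaced with the ports your cluster actually uses.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address this machine resolves for hostname, or None."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        # No hosts-file entry and no DNS record for this name.
        return None


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_services(services):
    """Map each service name to a (resolved_ip, reachable) pair.

    services: dict of name -> (hostname, port); the entries are supplied
    by the caller and are not read from any Hadoop configuration file.
    """
    report = {}
    for name, (host, port) in services.items():
        ip = resolve(host)
        report[name] = (ip, ip is not None and port_open(host, port))
    return report
```

For instance, `check_services({"ResourceManager": ("talend-cdh550.weave.local", 8032)})` returns `(None, False)` for that entry when the hosts file has no mapping for the name. Here 8032 is assumed to be the Resource Manager port and may differ on your cluster. A resolved address with `False` points to a closed or filtered port rather than a name resolution problem.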
The Hadoop cluster to be used has been properly configured and is running.
The Cloudera Hadoop cluster used in this example is CDH V5.5 in YARN mode and applies the default configuration of the distribution, without Kerberos security enabled. For further information about the default configuration of the CDH V5.5 distribution, see the Cloudera documentation.