Anonymous
Not applicable

Open a connection to Hadoop Cluster from Talend Open Studio For Big Data

Hi,

I am using Talend Open Studio for Big Data and want to connect to an HDP 2.6 cluster on AWS.

Here is a screenshot of the cluster settings:

0683p000009LshO.png
I don't understand which username to set in the Authentication section. Could you suggest how to find the right username?

When I click the Check Services button, it throws the following exception:

org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
	at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider.checkService(AbstractCheckedServiceProvider.java:57)
	at org.talend.designer.hdfsbrowse.hadoop.service.HadoopServiceBean.check(HadoopServiceBean.java:102)
	at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckHadoopServicesDialog$5.run(CheckHadoopServicesDialog.java:373)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
Caused by: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
	at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit.execute(CheckedWorkUnit.java:47)
	at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider.checkService(AbstractCheckedServiceProvider.java:54)
	... 5 more
Caused by: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
	at java.util.concurrent.FutureTask.report(Unknown Source)
	at java.util.concurrent.FutureTask.get(Unknown Source)
	at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit.execute(CheckedWorkUnit.java:44)
	... 6 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at org.talend.core.utils.ReflectionUtils.invokeStaticMethod(ReflectionUtils.java:229)
	at org.talend.designer.hdfsbrowse.hadoop.service.check.provider.CheckedNamenodeProvider.check(CheckedNamenodeProvider.java:70)
	at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider$1.run(AbstractCheckedServiceProvider.java:49)
	at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit$1.call(CheckedWorkUnit.java:65)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	... 3 more
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: ip-10-0-xxx-xxx.ec2.internal
	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:411)
	at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:311)
	at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:688)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:629)
	at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:159)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2761)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:383)
	... 12 more
Caused by: java.net.UnknownHostException: ip-10-0-xxx-xxx.ec2.internal
	... 20 more

 

7 Replies
Anonymous
Not applicable
Author

Hello,

It seems that the hostname ip-10-0-xxx-xxx.ec2.internal cannot be resolved from the machine running the Studio.

Are your NameNode URI and Resource Manager settings correct? Can you connect to HDP 2.6 on AWS successfully with a client other than Talend?
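
The java.net.UnknownHostException at the bottom of the trace means the JVM could not resolve the NameNode hostname. A quick way to confirm this outside the Studio is a small standalone check, run on the same machine. This is only a minimal sketch: the class name HostCheck and the method resolves are made up for illustration and are not part of Talend or Hadoop; the only library call used is the standard JDK method InetAddress.getByName.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper: checks whether this machine can resolve a hostname. */
class HostCheck {

    /** Returns true if the local resolver (DNS or hosts file) knows the hostname. */
    static boolean resolves(String hostname) {
        try {
            InetAddress.getByName(hostname);
            return true;
        } catch (UnknownHostException e) {
            // Same exception type that ends the stack trace in the question.
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the NameNode hostname from the connection settings as the first argument.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + (resolves(host) ? " resolves" : " does NOT resolve"));
    }
}
```

If the name does not resolve, one likely explanation (an assumption, not confirmed in this thread) is that *.ec2.internal names are only known to the DNS inside the AWS network. In that case a hosts-file entry that maps the hostname to an address reachable from your machine may help.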

Best regards

Sabrina

 

Anonymous
Not applicable
Author

Hi,

I took the NameNode URI and Resource Manager values from core-site.xml and yarn-site.xml. Both work fine, and I can connect to HDP 2.6 on AWS successfully with a client other than Talend.

Could you suggest which username I should use in the Authentication section?

I am using OpenJDK version "1.8.0_131". Which Java version is compatible with Talend?

Anonymous
Not applicable
Author

Hello,

So far, OpenJDK is not officially supported by Talend. Could you please try Oracle JDK 1.8 to see if it works?

If you get compile errors, please check the "Code" tab of your job; the error will be highlighted in red.

Best regards

Sabrina

Anonymous
Not applicable
Author

Hi,

I can open a Hive connection successfully to HiveServer2 on port 10000, but I cannot open a connection to the Hadoop cluster: it throws a NameNode exception. I can fetch the HDFS directory listing using the NameNode URI hdfs://ip-XX-X-XXX-XX.ec2.internal:8020/user/.

These are the connection parameters:
NameNode URI = hdfs://ip-XX-X-XXX-XX.ec2.internal:8020/
Resource Manager = ip-XX-X-XXX-XX.ec2.internal:8050
Resource Manager Scheduler = ip-XX-X-XXX-XX.ec2.internal:8030
Job History = ip-XX-X-XXX-XX.ec2.internal:10020
Staging Directory = /user
User Name = hdfs

But it throws the following NameNode exception:

org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: java.util.concurrent.TimeoutException
	at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider.checkService(AbstractCheckedServiceProvider.java:57)
	at org.talend.designer.hdfsbrowse.hadoop.service.HadoopServiceBean.check(HadoopServiceBean.java:102)
	at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckHadoopServicesDialog$5.run(CheckHadoopServicesDialog.java:373)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
Caused by: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: java.util.concurrent.TimeoutException
	at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit.execute(CheckedWorkUnit.java:47)
	at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider.checkService(AbstractCheckedServiceProvider.java:54)
	... 5 more
Caused by: java.util.concurrent.TimeoutException
	at java.util.concurrent.FutureTask.get(Unknown Source)
	at org.talend.designer.hdfsbrowse.hadoop.service.check.CheckedWorkUnit.execute(CheckedWorkUnit.java:44)
	... 6 more

 

Anonymous
Not applicable
Author

Hello,

Are you behind a proxy? Have you tried changing the timeout setting under Preferences > Talend > Performance > Connection timeout to see if it works?
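
Before raising the timeout, it can help to check whether the NameNode port accepts a TCP connection at all from the machine running the Studio. The sketch below is an illustration only: the class name PortCheck, the method canConnect and the 5-second timeout are assumptions made for this example, not Talend settings. It uses only the standard JDK classes Socket and InetSocketAddress.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: tests whether a TCP port accepts connections within a timeout. */
class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolved hostnames.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: java PortCheck <host> <port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        boolean ok = canConnect(host, port, 5000); // 5 s is an arbitrary choice
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable within 5 s"));
    }
}
```

Run it with the NameNode host and port 8020 from the connection parameters above. If the port is not reachable, a firewall, security group or proxy is a more likely cause than the Studio timeout value.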

Best regards

Sabrina

p_tarkeshwar
Contributor

Hi, I am using Talend Big Data and want to connect it to Cloudera, but I am facing an issue. Please help. Cloudera is running in a VM and Talend is running on my local machine.
hadoop_connection.png
p_tarkeshwar
Contributor

Cloudera is running in a VM and Talend is running on my local machine. I am attaching a screenshot of the connection string. I access Cloudera from my local machine using the IP address of the Cloudera VM. I would appreciate a quick response.


hadoop_connection.png