Anonymous
Not applicable

[resolved] Cloudera (CentOS VM) and Talend Big Data OS (Win7) - config issue

Hello Team,
Here is the background:
I have been trying to connect Talend Open Studio for Big Data (6.x) on Windows 7 to CDH 5.4.2 on CentOS 6.x. Each runs in its own virtual machine under VMware Workstation.
I have connected the VMs using the pipe option in VMware. Note that Cloudera Manager has not been enabled on the CDH VM (to save RAM, as it requires a minimum of 8 GB).
I have tested connectivity from the Win7 VM to the CentOS VM using PuTTY and ping, and both are successful.
My next step was to connect to CDH from Talend Open Studio via tHiveConnection, but I have been facing an issue. I configured the connection following the video available here: http://www.codingcraze.com/set-up-your-cloudera-vm-and-use-talend-open-studio-for-big-data/
I renamed the CDH host to cloudera-vm and added the Windows user trinity.
I have tried connecting with both users, cloudera (CentOS) and trinity (Win7), but every time I get the following error. The logs below are for the trinity user; the same errors appear for the cloudera user too.
Starting job tHiveConnect at 12:47 20/09/2016.
connecting to socket on port 3796
connected
: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
: org.apache.hadoop.security.ShellBasedUnixGroupsMapping - got exception trying to get groups for user trinity: Incorrect command line arguments.

: org.apache.hadoop.security.UserGroupInformation - No groups available for user trinity
: hive.metastore - set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:380)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:230)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:3604)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:3590)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:425)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:230)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1486)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2841)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2860)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
at org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:124)
at org.apache.hive.service.cli.CLIService.init(CLIService.java:111)
at org.apache.hive.service.cli.thrift.EmbeddedThriftBinaryCLIService.init(EmbeddedThriftBinaryCLIService.java:40)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:148)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(Unknown Source)
at java.sql.DriverManager.getConnection(Unknown Source)
at bigdatapractice.thiveconnect_0_1.tHiveConnect.tHiveConnection_1Process(tHiveConnect.java:362)
at bigdatapractice.thiveconnect_0_1.tHiveConnect.runJobInTOS(tHiveConnect.java:1142)
at bigdatapractice.thiveconnect_0_1.tHiveConnect.main(tHiveConnect.java:999)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.io.BufferedInputStream.read1(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 29 more
Exception in component tHiveConnection_1
java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:472)
at org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:124)
at org.apache.hive.service.cli.CLIService.init(CLIService.java:111)
at org.apache.hive.service.cli.thrift.EmbeddedThriftBinaryCLIService.init(EmbeddedThriftBinaryCLIService.java:40)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:148)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(Unknown Source)
at java.sql.DriverManager.getConnection(Unknown Source)
at bigdatapractice.thiveconnect_0_1.tHiveConnect.tHiveConnection_1Process(tHiveConnect.java:362)
at bigdatapractice.thiveconnect_0_1.tHiveConnect.runJobInTOS(tHiveConnect.java:1142)
at bigdatapractice.thiveconnect_0_1.tHiveConnect.main(tHiveConnect.java:999)
disconnected
Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-
at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:557)
at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:506)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:458)
... 10 more
Job tHiveConnect ended at 12:57 20/09/2016.

Please help me understand if I missed anything.
2 Replies
Anonymous
Not applicable
Author

Hi Navinderbawa,
You need to have the user (trinity) in both environments, with the same uid and gid, for this to work.
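For example, on the CentOS side (a minimal sketch; the uid/gid value 501 is only an illustration, use whatever matches your setup):
# check the ids of the existing user
id cloudera
# create trinity with a matching group and uid (values are illustrative)
sudo groupadd -g 501 trinity
sudo useradd -u 501 -g 501 trinity
# verify
id trinity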
On top of that, it would be worth checking whether your Cloudera node accepts impersonation.
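Impersonation is normally enabled in core-site.xml on the cluster; a sketch using the standard hadoop.proxyuser properties, with permissive placeholder values that you should tighten outside a sandbox:
<property>
  <name>hadoop.proxyuser.trinity.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.trinity.groups</name>
  <value>*</value>
</property>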
Have you also tried running the following from a cmd shell on your Windows machine?
hadoop fs -ls /
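Note that this assumes a Hadoop client is set up on the Windows side (HADOOP_HOME and winutils.exe in place) with fs.defaultFS pointing at your VM; otherwise you can address the cluster explicitly, for example (cloudera-vm is your renamed host, 8020 the default CDH NameNode port):
hadoop fs -ls hdfs://cloudera-vm:8020/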

The last step would be to go to Metadata -> Hadoop Cluster, right-click and choose Create Hadoop Cluster, and check the services to make sure everything is up and running.
It is good practice to register your connection under Metadata in the Repository instead of reinventing the wheel each time.
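One more thing: the second exception in your log says the root scratch dir /tmp/hive on HDFS should be writable but currently has rw-rw-rw- permissions. A common fix (a sketch, assuming the default Hive scratch dir and access to the hdfs superuser) is:
# run on the cluster as the hdfs superuser
sudo -u hdfs hadoop fs -chmod -R 777 /tmp/hive
sudo -u hdfs hadoop fs -ls /tmp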
Anonymous
Not applicable
Author

Hi Adrien,
I tried with the Hadoop cluster metadata wizard and it worked. Maybe the trinity user was unable to use the password and was attempting superuser access.
I will also try impersonation. Thanks for your help.
Regards,