Anonymous
Not applicable

Unable to process (copy) data in Hive with S3 external tables

Hi there.

 

We are currently working on an EMR (Spark) and Talend Big Data Platform PoC.

We would like to copy data from one S3-backed external Hive table to another S3-backed external Hive table using the Big Data Platform.

(Both external tables are on EMR.)

 

After creating the external tables in Hive, we tried to copy the data from Talend Studio.

The job is supposed to process the data with Spark, but it failed:

 

19/01/30 05:35:49 WARN ipc.Client: Failed to connect to server: ip-10-118-120-67.ap-northeast-1.compute.internal/10.118.120.67:8020: try once and fail.
java.net.NoRouteToHostException: No route to host

 

Please see the attached log.

 

It seems that Talend could not connect to the nodes.

 

We have only 2 nodes (master and slave).

The IPs are 10.118.120.90 and 10.118.120.114, not 10.118.120.67.

 

 

The host ip-10-118-120-67.ap-northeast-1.compute.internal is the address of a previous node.

We always stop the EMR cluster at midnight and start it again in the morning to save cost.

 

We are not sure why the Talend job is trying to connect to 10.118.120.67.

We did update the configuration: cluster, HDFS, DB connection, and remote server (we installed the JobServer on the EMR master node).
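
Port 8020 in the log is the default HDFS NameNode RPC port, so the failing call looks like an HDFS client still using the old NameNode address. One way to double-check for a leftover reference is to scan the Hadoop-style configuration files the job picks up. The following is only a minimal sketch: the sample XML is hypothetical and stands in for whatever `core-site.xml` or similar file the job actually reads, and the helper name `find_stale_properties` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Old node address as it appears in the log (both IP and host-name form).
STALE_MARKERS = ("10.118.120.67", "ip-10-118-120-67")


def find_stale_properties(xml_text, markers=STALE_MARKERS):
    """Return (name, value) pairs whose value mentions any stale marker."""
    root = ET.fromstring(xml_text)
    hits = []
    for prop in root.iter("property"):
        name = prop.findtext("name", default="")
        value = prop.findtext("value", default="")
        if any(marker in value for marker in markers):
            hits.append((name, value))
    return hits


# Hypothetical file content, for illustration only (not taken from the cluster).
SAMPLE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ip-10-118-120-67.ap-northeast-1.compute.internal:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/mnt/var/lib/hadoop/tmp</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for name, value in find_stale_properties(SAMPLE):
        print(f"{name} still points at the old node: {value}")
```

Running the same function over each configuration file (and over any exported job context files, if they are XML) would show whether an old address survived the restart.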

 

Can anyone give us advice on this issue?

2 Replies
Anonymous
Not applicable
Author

It looks like the IP addresses of your nodes are changing. Do they change every day when you stop and start the cluster?

roalro
Contributor III

Hi.

 

Were you able to copy the data from one S3-backed Hive external table to another S3-backed Hive external table?

 

I ask because I'm trying to do the same thing, and Talend throws the following error:

 

The location of the existing table `zettabeat`.`zbtb00_zbhm9999_prueba` is `s3a://enel-noprod-glin-ap04032-zettabeatspain/zettabeat/zb_test/Prueba`. It doesn't match the specified location `hdfs://nameservice1/user/hive/warehouse/zettabeat.db/zbtb00_zbhm9999_prueba`.;

I created both external tables at their S3 locations before the execution, but as I understand it, Talend is trying to write the table to HDFS.
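
To make the mismatch in that message easier to see, here is a small sketch that splits both locations into file system and path. The two URIs are copied from the error above; the helper name `describe_mismatch` is made up for illustration. It only confirms that the two locations live on different file systems (`s3a` vs `hdfs`), it does not show why the job falls back to the warehouse path.

```python
from urllib.parse import urlparse

# Both locations are copied verbatim from the error message.
EXISTING = "s3a://enel-noprod-glin-ap04032-zettabeatspain/zettabeat/zb_test/Prueba"
SPECIFIED = "hdfs://nameservice1/user/hive/warehouse/zettabeat.db/zbtb00_zbhm9999_prueba"


def describe_mismatch(existing, specified):
    """Compare two table locations by file system (scheme + authority) and path."""
    a, b = urlparse(existing), urlparse(specified)
    return {
        "existing_fs": f"{a.scheme}://{a.netloc}",
        "specified_fs": f"{b.scheme}://{b.netloc}",
        "scheme_differs": a.scheme != b.scheme,
        "path_differs": a.path != b.path,
    }


if __name__ == "__main__":
    for key, value in describe_mismatch(EXISTING, SPECIFIED).items():
        print(f"{key}: {value}")
```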

 

Thanks.