_AnonymousUser
Specialist III

[resolved] TOS writing data to Hadoop HDFS error

Dear Talend support,
We hit an error when writing to Hadoop HDFS. Using TOS, we:
1. could access HDFS
2. successfully created a file
3. saw in TOS that all data was sent to HDFS. However, the file TOS created was empty, and we got the error message below.
  Starting job hdfsout at 20:30 20/05/2016.
 
connecting to socket on port 3484
connected
: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
: org.apache.hadoop.hdfs.DFSClient - DataStreamer Exception
java.nio.channels.UnresolvedAddressException
        at sun.nio.ch.Net.checkAddress(Unknown Source)
        at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
        at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1622)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1420)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1373)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:594)
Exception in component tHDFSOutput_1
java.io.IOException: DataStreamer Exception:
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:708)
Caused by: java.nio.channels.UnresolvedAddressException
        at sun.nio.ch.Net.checkAddress(Unknown Source)
        at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
        at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1622)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1420)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1373)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:594)
disconnected
: org.apache.hadoop.hdfs.DFSClient - Failed to close inode 17723
java.io.IOException: DataStreamer Exception:
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:708)
Caused by: java.nio.channels.UnresolvedAddressException
        at sun.nio.ch.Net.checkAddress(Unknown Source)
        at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
        at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1622)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1420)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1373)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:594)
Job hdfsout ended at 20:30 20/05/2016.
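An UnresolvedAddressException thrown from sun.nio.ch.Net.checkAddress, as in the trace above, means the JVM was handed a hostname it could not resolve — here most likely a DataNode hostname returned by the NameNode. A quick JDK-only sketch for checking resolution (the hostnames below are placeholders, not taken from this cluster; 50010 is the default Hadoop 2.x DataNode transfer port):

```java
import java.net.InetSocketAddress;

public class ResolveCheck {
    public static void main(String[] args) {
        // Hypothetical cluster hostnames -- substitute your NameNode/DataNode names.
        String[] hosts = {"namenode.example.com", "datanode1.example.com"};
        for (String h : hosts) {
            // InetSocketAddress resolves eagerly; isUnresolved() is true when
            // DNS (or /etc/hosts) has no entry for the name -- the precondition
            // for UnresolvedAddressException when a socket later connects.
            InetSocketAddress addr = new InetSocketAddress(h, 50010);
            System.out.println(h + " resolved=" + !addr.isUnresolved());
        }
    }
}
```

Any hostname that prints `resolved=false` must be added to DNS or the client's hosts file before the write pipeline can work.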
 
1 Solution

Accepted Solutions
_AnonymousUser
Specialist III
Author

Thanks Shong and Rohan.
It was a connectivity issue; after we changed to a different ID, it works. Thank you for your time.
Lan


8 Replies
Anonymous
Not applicable

Hi 
Caused by: java.nio.channels.UnresolvedAddressException
        at sun.nio.ch.Net.checkAddress(Unknown Source)
        at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)

It looks like a connection issue. Are you able to read data from HDFS with tHDFSInput?
Regards
Shong
_AnonymousUser
Specialist III
Author

Shong,
If we could not connect to Hadoop, how could we create an empty file? We can change the file name and create a different file, yet the content is still empty.
Lan
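The empty file is actually consistent with a connection problem: creating the file is a NameNode metadata operation, so it can succeed even when the DataNodes are unreachable, while the data itself is streamed over a separate pipeline to DataNode addresses handed back by the NameNode. A minimal JDK-only sketch reproducing the same exception with a deliberately unresolvable hostname (the name below is a placeholder; `.invalid` is a reserved TLD that never resolves):

```java
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

public class PipelineFailureDemo {
    public static void main(String[] args) throws Exception {
        // The write pipeline connects directly to DataNodes. If the DataNode
        // hostname cannot be resolved, the connect fails exactly as in the
        // DataStreamer log above -- after the NameNode already created the file.
        try (SocketChannel ch = SocketChannel.open()) {
            ch.connect(new InetSocketAddress("datanode1.invalid", 50010));
        } catch (UnresolvedAddressException e) {
            System.out.println("UnresolvedAddressException, as in the DataStreamer log");
        }
    }
}
```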
_AnonymousUser
Specialist III
Author

Shong,
Following your suggestion, we ran the read test and got an error too. Any suggestions?
Starting job HDFSInput at 13:59 25/05/2016.
 
connecting to socket on port 3744
connected
: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
: org.apache.hadoop.hdfs.BlockReaderFactory - I/O error constructing remote block reader.
java.net.ConnectException: Connection refused: no further information
      at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
      at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
      at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
      at org.apache.hadoop.net.NetUtils.connect(NetUtils

To see the whole post, download it here
OriginalPost.pdf
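Note that this read test fails differently: Connection refused means the hostname resolved and the machine was reached, but nothing was listening on that port (or a firewall rejected the connection) — a reachability/configuration problem rather than the DNS-style failure in the write job. A small JDK-only sketch showing how that error arises, by provoking a refused connection locally:

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.channels.SocketChannel;

public class RefusedVsUnresolved {
    public static void main(String[] args) throws IOException {
        // Grab a port the OS considers free, then close it so nothing listens.
        int port;
        try (ServerSocket ss = new ServerSocket(0)) {
            port = ss.getLocalPort();
        }
        try (SocketChannel ch = SocketChannel.open()) {
            // Host resolves (localhost) but no listener is on the port:
            // "Connection refused", like the tHDFSInput read test above.
            ch.connect(new InetSocketAddress("localhost", port));
        } catch (ConnectException e) {
            System.out.println("ConnectException: " + e.getMessage());
        }
    }
}
```

So the two errors in this thread point at different layers: the write job could not resolve a name at all, while the read test resolved a name but found no service answering.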
Anonymous
Not applicable

My guess is that the user has a permission issue on the cluster. I have had a similar experience where I was able to create a table in Hive, which creates the data file, but could not insert into the table I had just created!
Please check whether the user you are using in Talend is able to write data into HDFS.
Thanks
Rohan
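One way to test what a specific user may do, assuming WebHDFS is enabled on the cluster (HTTP port 50070 by default on Hadoop 2.x), is to issue a CREATE request with an explicit user.name and see whether it is rejected. The helper below just builds the REST URL; the hostname, path, and user are hypothetical placeholders:

```java
public class WebHdfsCreateUrl {
    // Builds the WebHDFS CREATE URL for a given user. WebHDFS uses a two-step
    // create: a PUT to this URL returns a redirect to a DataNode on success,
    // or an error response if the user lacks write permission on the path.
    static String createUrl(String namenode, int port, String path, String user) {
        return "http://" + namenode + ":" + port + "/webhdfs/v1" + path
                + "?op=CREATE&user.name=" + user;
    }

    public static void main(String[] args) {
        // Hypothetical host, path, and user -- substitute your own.
        System.out.println(createUrl("namenode.example.com", 50070, "/tmp/perm_test", "lan"));
    }
}
```

Issuing `curl -i -X PUT "<url>"` against the printed URL should show a redirect to a DataNode when the user may write, or a 403-style rejection when it may not.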
_AnonymousUser
Specialist III
Author

Thanks Shong and Rohan.
It was a connectivity issue; after we changed to a different ID, it works. Thank you for your time.
Lan
Anonymous
Not applicable

If you have been getting this issue, it could be related to this thread:
https://community.talend.com/t5/Design-and-Development/resolved-Simple-tHDFSPut-based-job-fails-on-6...
_AnonymousUser
Specialist III
Author

Thanks Shong and Rohan.
It was a connectivity issue; after we changed to a different ID, it works. Thank you for your time.
Lan

Hi Lan,
Can you please elaborate on "changing to a different Id"? I am getting the same error.
Regards
Arpit
Anonymous
Not applicable

I was seeing a similar error trying to connect to my AWS Hadoop setup:
Exception in component tHDFSPut_1
java.io.IOException: DataStreamer Exception: 
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:563)
Caused by: java.nio.channels.UnresolvedAddressException
I had set up a pre-defined Hadoop connection that tested fine during setup, but when writing to Hadoop from my sample workflow the file was created, the data was not written, and I got the above error. I confirmed that my namenode had host-file entries for each machine in the setup and that my Talend Hadoop connection was not using an IP address (as suggested by rhall), but still no luck.
Then it dawned on me that the AWS namenode instance has to be restarted after updating the host file (noob error). Once I restarted the AWS namenode and restarted Hadoop, the issue was resolved. The posts in this forum are very helpful, so thanks all for the info in these posts.
I had set up a pre-defined hadoop connection that tested out fine during setup, but when writing to Hadoop from my sample workflow the file got created, but the data did not get written and I got the above error.  I confirmed that my namenode had host file entries for each machine in the setup and that my Talend Hadoop connection was not using IP address (as suggested by rhall), but still no luck.  Then it dawned on me that the AWS namenode instance has to be re-started after updating the host file (noob error).  Once I restarted the AWS namenode and restarted hadoop then the issue was resolved. The posts in this forum are very helpful, so thanks all for the info in these posts.