Hi community,
I'm using a Hadoop cluster with 3 nodes. When I try to write a file to HDFS, I get this error:
Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /usr/local/hadoop/hdfs/data/datanode/javadeveloperzone/javareadwriteexample/read_write_hdfs_example.txt could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2219)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2789)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:892)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1497)
at org.apache.hadoop.ipc.Client.call(Client.java:1443)
at org.apache.hadoop.ipc.Client.call(Client.java:1353)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy14.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:510)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy15.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1078)
at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1865)
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1668)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
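For context, the write itself is a plain FileSystem create; simplified, it looks roughly like this (the NameNode URI is a placeholder, not my real host):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadWriteHdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020"); // placeholder URI
        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.create(new Path(
                 "/usr/local/hadoop/hdfs/data/datanode/javadeveloperzone/javareadwriteexample/read_write_hdfs_example.txt"))) {
            out.writeUTF("example content"); // fails here (or at close) with the exception above
        }
    }
}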
I checked the status of all my DataNodes, and everything seems OK.
Do you have any idea how to solve this issue, please?
Thank you.
This can be caused by various conditions, including:
No DataNode instances being up and running.
Action: look at the servers, see if the processes are running.
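Besides checking the processes on each server, you can ask the NameNode which DataNodes it currently considers live. A minimal sketch using the DistributedFileSystem API (it assumes your cluster configuration is on the classpath):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.hdfs.protocol.HdfsConstants.DatanodeReportType;

public class LiveDataNodes {
    public static void main(String[] args) throws Exception {
        // Reads fs.defaultFS etc. from core-site.xml / hdfs-site.xml on the classpath
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            DistributedFileSystem dfs = (DistributedFileSystem) fs;
            for (DatanodeInfo dn : dfs.getDataNodeStats(DatanodeReportType.LIVE)) {
                System.out.println("live: " + dn.getHostName()
                        + "  remaining=" + dn.getRemaining() + " bytes");
            }
        }
    }
}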
The DataNode instances cannot talk to the server, because of networking or Hadoop configuration problems.
Action: look at the logs of one of the DataNodes.
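Note that the "3 node(s) are excluded" part of the message often means the client reached the NameNode but could not open a data connection to the DataNodes themselves. A quick reachability probe you can run from the client machine (hostnames are placeholders; 9866 is the Hadoop 3.x default for dfs.datanode.address, 50010 on Hadoop 2.x):

import java.net.InetSocketAddress;
import java.net.Socket;

public class DataNodePortProbe {
    public static void main(String[] args) {
        String[] hosts = { "datanode1", "datanode2", "datanode3" }; // placeholders
        int port = 9866; // dfs.datanode.address default in Hadoop 3.x
        for (String h : hosts) {
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(h, port), 2000); // 2 s timeout
                System.out.println(h + ":" + port + " reachable");
            } catch (Exception e) {
                System.out.println(h + ":" + port + " NOT reachable: " + e.getMessage());
            }
        }
    }
}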
Your DataNode instances have no hard disk space in their configured data directories.
Action: look at the dfs.data.dir list (dfs.datanode.data.dir in Hadoop 2 and later) in the node configurations, verify that at least one of the directories exists and is writable by the user running the Hadoop processes, then look at the logs.
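A small check you can run on each DataNode host, as the user that runs the Hadoop processes (the directory below is an assumption taken from the error message above; substitute your own data directory entries):

import java.io.File;

public class DataDirCheck {
    public static void main(String[] args) {
        // Replace with the directories listed in dfs.data.dir / dfs.datanode.data.dir
        String[] dirs = { "/usr/local/hadoop/hdfs/data/datanode" };
        for (String d : dirs) {
            File f = new File(d);
            System.out.println(d + "  exists=" + f.exists()
                    + "  dir=" + f.isDirectory()
                    + "  writable=" + f.canWrite());
        }
    }
}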
Your DataNode instances have run out of space. Look at the disk capacity via the NameNode web pages.
Action: delete old files. Compress under-used files. Buy more disks for existing servers (if there is room), upgrade the existing servers to bigger drives, or add some more servers.
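If you prefer to check programmatically rather than through the web UI, FileSystem.getStatus() returns the aggregate numbers the NameNode reports (a sketch; assumes the cluster config is on the classpath):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class ClusterCapacity {
    public static void main(String[] args) throws Exception {
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            FsStatus st = fs.getStatus();
            System.out.printf("capacity=%,d  used=%,d  remaining=%,d bytes%n",
                    st.getCapacity(), st.getUsed(), st.getRemaining());
        }
    }
}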
The reserved space for a DN (as set in dfs.datanode.du.reserved) is greater than the remaining free space, so the DN thinks it has no free space.
Action: look at the value of this option and compare it with the amount of available space on your DataNodes.
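A sketch of that comparison, run on a DataNode host (the data directory is an assumption taken from the error above; dfs.datanode.du.reserved defaults to 0):

import java.io.File;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class ReservedSpaceCheck {
    public static void main(String[] args) {
        Configuration conf = new HdfsConfiguration(); // loads hdfs-default.xml / hdfs-site.xml
        long reserved = conf.getLong("dfs.datanode.du.reserved", 0L);
        File dataDir = new File("/usr/local/hadoop/hdfs/data/datanode"); // assumed data dir
        long usable = dataDir.getUsableSpace();
        System.out.println("reserved=" + reserved + "  usable=" + usable
                + (usable <= reserved ? "  -> DN will report no free space" : ""));
    }
}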
There are not enough threads in the DataNodes, and requests are being rejected.
Action: look in the DataNode logs and check the value of dfs.datanode.handler.count.
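You can confirm the effective value the same way as in the sketches above (10 is the shipped default for dfs.datanode.handler.count):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class HandlerCountCheck {
    public static void main(String[] args) {
        Configuration conf = new HdfsConfiguration(); // loads hdfs-default.xml / hdfs-site.xml
        System.out.println("dfs.datanode.handler.count = "
                + conf.getInt("dfs.datanode.handler.count", 10));
    }
}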
Some configuration problem is preventing effective two-way communication. In particular, the combination of settings below has been seen to trigger the connectivity problem:
dfs.data.transfer.protection = authentication
dfs.encrypt.data.transfer = false
Action: check to see if this combination is set. If so, either disable protection or enable encryption.
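A small sketch that tests for exactly this combination against the configuration on the classpath:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class TransferSettingsCheck {
    public static void main(String[] args) {
        Configuration conf = new HdfsConfiguration();
        String protection = conf.get("dfs.data.transfer.protection", "");
        boolean encrypt = conf.getBoolean("dfs.encrypt.data.transfer", false);
        if ("authentication".equals(protection) && !encrypt) {
            System.out.println("Problematic combination: either unset"
                    + " dfs.data.transfer.protection or set dfs.encrypt.data.transfer=true");
        } else {
            System.out.println("Combination not present.");
        }
    }
}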
You may also get this message due to permissions.
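If you suspect permissions, you can inspect the owner and mode of the target directory from the client (a sketch; the path is the parent directory from the error above, so adjust as needed):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PermissionCheck {
    public static void main(String[] args) throws Exception {
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            FileStatus st = fs.getFileStatus(new Path(
                "/usr/local/hadoop/hdfs/data/datanode/javadeveloperzone/javareadwriteexample"));
            System.out.println(st.getOwner() + ":" + st.getGroup() + " " + st.getPermission());
        }
    }
}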
This is not a problem in Hadoop itself; it is a problem (possibly configuration) in your cluster, so you may want to involve your Hadoop admin team here.