Unable to read a file with tHDFSInput (HDFS, Cloudera VM)

Hello,
I have a problem with TOS Big Data v6: I am trying to read a file located in HDFS with a tHDFSInput component, and I get this error:
: org.apache.hadoop.hdfs.DFSClient - Could not obtain block: BP-286282631-127.0.0.1-1433865208026:blk_1073742460_1646 file=/user/cloudera/achats.txt No live nodes contain current block Block locations: DatanodeInfoWithStorage Dead nodes:  DatanodeInfoWithStorage. Throwing a BlockMissingException
: org.apache.hadoop.hdfs.DFSClient - Could not obtain block: BP-286282631-127.0.0.1-1433865208026:blk_1073742460_1646 file=/user/cloudera/achats.txt No live nodes contain current block Block locations: DatanodeInfoWithStorage Dead nodes:  DatanodeInfoWithStorage. Throwing a BlockMissingException
Exception in component tHDFSInput_1
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-286282631-127.0.0.1-1433865208026:blk_1073742460_1646 file=/user/cloudera/achats.txt
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:938)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:607)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:847)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:897)
at java.io.DataInputStream.read(DataInputStream.java:149)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.PushbackInputStream.read(PushbackInputStream.java:186)
at org.talend.fileprocess.UnicodeReader.<init>(UnicodeReader.java:25)
at org.talend.fileprocess.TOSDelimitedReader.<init>(TOSDelimitedReader.java:77)
at org.talend.fileprocess.FileInputDelimited.<init>(FileInputDelimited.java:93)
at local_project.tes_001_0_1.tes_001.tHDFSInput_1Process(tes_001.java:827)
at local_project.tes_001_0_1.tes_001.tHDFSConnection_1Process(tes_001.java:364)
: org.apache.hadoop.hdfs.DFSClient - DFS Read
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-286282631-127.0.0.1-1433865208026:blk_1073742460_1646 file=/user/cloudera/achats.txt
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:938)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:607)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:847)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:897)
at java.io.DataInputStream.read(DataInputStream.java:149)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.PushbackInputStream.read(PushbackInputStream.java:186)
at org.talend.fileprocess.UnicodeReader.<init>(UnicodeReader.java:25)
at org.talend.fileprocess.TOSDelimitedReader.<init>(TOSDelimitedReader.java:77)
at org.talend.fileprocess.FileInputDelimited.<init>(FileInputDelimited.java:93)
at local_project.tes_001_0_1.tes_001.tHDFSInput_1Process(tes_001.java:827)
at local_project.tes_001_0_1.tes_001.tHDFSConnection_1Process(tes_001.java:364)
at local_project.tes_001_0_1.tes_001.runJobInTOS(tes_001.java:1198)
at local_project.tes_001_0_1.tes_001.main(tes_001.java:1055)
disconnected
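To isolate the problem from Talend, I could test the same read with the plain HDFS Java API. This is only a minimal sketch; the NameNode URI, the "cloudera" user and the dfs.client.use.datanode.hostname property are my assumptions for a Cloudera VM, not values taken from the job above:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed NameNode URI; adjust to the one used in tHDFSConnection.
        conf.set("fs.defaultFS", "hdfs://quickstart.cloudera:8020");
        // Connect to DataNodes by hostname instead of the VM-internal IP
        // (e.g. 127.0.0.1) that the NameNode reports to external clients.
        conf.set("dfs.client.use.datanode.hostname", "true");

        FileSystem fs = FileSystem.get(URI.create("hdfs://quickstart.cloudera:8020"), conf, "cloudera");
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(new Path("/user/cloudera/achats.txt"))));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);
        }
        reader.close();
        fs.close();
    }
}

If this sketch fails with the same BlockMissingException, the block is visible to the NameNode but not reachable from my Windows client, which would point to a network/hostname problem rather than a Talend one.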

When I tried to put a file into HDFS with the Talend component tHDFSPut, the file was created but empty, and I got this error:
: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in component tHDFSPut_1
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/cloudera/soc.txt could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1541)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3286)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:667)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:212)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:483)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
: org.apache.hadoop.hdfs.DFSClient - DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/cloudera/soc.txt could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1541)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3286)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:667)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:212)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:483)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy8.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1544)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)
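From what I understand, "could only be replicated to 0 nodes instead of minReplication (=1) ... 1 node(s) are excluded" usually means the client reached the NameNode (which is why the empty file gets created) but could not open a data connection to the DataNode, so that node was excluded from the write. Here is a minimal write sketch with the same hostname property; again, the URI, user and path are my assumptions:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://quickstart.cloudera:8020"); // assumed NameNode URI
        // Without this, a client outside the VM asks for the DataNode by its
        // internal address, the connection fails, and the node is "excluded".
        conf.set("dfs.client.use.datanode.hostname", "true");

        FileSystem fs = FileSystem.get(URI.create("hdfs://quickstart.cloudera:8020"), conf, "cloudera");
        FSDataOutputStream out = fs.create(new Path("/user/cloudera/soc.txt"), true); // true = overwrite
        out.writeBytes("test line\n");
        out.close();
        fs.close();
    }
}

If that is the cause, the same property can normally be added as an extra Hadoop property on the tHDFSConnection component, and the VM's hostname has to resolve from Windows (for example through an entry in C:\Windows\System32\drivers\etc\hosts).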

My configuration:
Windows 7 machine
TOS Big Data v6 (installed on the local machine)
Cloudera VM 5.4.2
I have already installed the Hadoop bin module (v2.6) and created the HADOOP_HOME environment variable.
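The NativeCodeLoader line in the tHDFSPut log is only a warning, but on Windows the Hadoop client also looks for winutils.exe under %HADOOP_HOME%\bin. In case the variable is not picked up by the Studio, a workaround I have seen is to set the corresponding system property in the job itself (a sketch; the path is an assumption for my machine):

// Hadoop's Shell class reads the "hadoop.home.dir" property first,
// then falls back to the HADOOP_HOME environment variable.
// Assumption: winutils.exe is installed in C:\hadoop\bin.
System.setProperty("hadoop.home.dir", "C:\\hadoop");

In Talend this line can go in a tJava component executed before the tHDFSConnection.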
Thanks for your help.
2 Replies
Hi,
Are you able to access HDFS successfully from your local machine?
Best regards,
Sabrina
Hi Sabrina,
I'm able to access HDFS from my local machine.
Actually, I have already configured the Hadoop cluster in the Talend repository. I am able to retrieve the files from the cluster, but when I try to use tHDFSInput to read the file, I get this message:
Exception in component tHDFSInput_1
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-286282631-127.0.0.1-1433865208026:blk_1073742440_1620 file=/user/hdfs/achats.txt
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:938)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:607)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:847)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:897)
at java.io.DataInputStream.read(DataInputStream.java:149)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.PushbackInputStream.read(PushbackInputStream.java:186)
at org.talend.fileprocess.UnicodeReader.<init>(UnicodeReader.java:25)
at org.talend.fileprocess.TOSDelimitedReader.<init>(TOSDelimitedReader.java:77)
at org.talend.fileprocess.FileInputDelimited.<init>(FileInputDelimited.java:93)
at local_project.tes_001_0_1.tes_001.tHDFSInput_1Process(tes_001.java:741)
at local_project.tes_001_0_1.tes_001.runJobInTOS(tes_001.java:1136)

Any help, please?