Hi guys,
I have a simple job that uses a tFileList to iterate through a directory on my local machine and a tHDFSPut to upload each file to HDFS. I have made no changes since v6.0, where it worked fine. Since upgrading to v6.1 I get the following error after the first file has been uploaded:
Exception in component tHDFSPut_1
java.io.IOException: DataStreamer Exception:
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:796)
Caused by: java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:101)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1752)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1530)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1483)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:668)
The UnresolvedAddressException seems strange, since the job ran perfectly well with the same settings in v6.0, and the v6.1 job was able to load a single file before throwing this error. To show the structure of the job, I am including a screenshot. The tJava is only there to log some things for me; it does nothing else.
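In case it helps anyone reproduce this outside the Studio, the job essentially boils down to something like the following against the plain Hadoop FileSystem API. This is only a rough sketch; the namenode URI, user name, and directory paths are placeholders rather than my real settings.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.File;
import java.net.URI;

public class HdfsPutCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // If the datanodes advertise hostnames the client machine cannot resolve,
        // this property sometimes helps (only a guess at the cause here):
        // conf.setBoolean("dfs.client.use.datanode.hostname", true);

        // Placeholder namenode URI and user -- replace with the real connection details
        FileSystem fs = FileSystem.get(new URI("hdfs://namenode:8020"), conf, "hdfsuser");

        File localDir = new File("C:/data/input");   // the directory the tFileList iterates over
        for (File f : localDir.listFiles()) {
            if (!f.isFile()) continue;
            // Roughly what tHDFSPut does for each file in the iterate flow
            fs.copyFromLocalFile(new Path(f.getAbsolutePath()),
                                 new Path("/user/hdfsuser/target/" + f.getName()));
            System.out.println("Uploaded " + f.getName());
        }
        fs.close();
    }
}

If the same UnresolvedAddressException shows up here too, that would point at the client not being able to resolve the datanode hostnames (the stack trace fails while opening the write pipeline to a datanode) rather than at the Studio itself.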
I have raised another question about a warning box I get when I try to load and run this job. It tells me I have a missing JAR (horrendously named ...hadoop-conf-_jW{with lots of other characters and numbers}.jar), and I have seen this raised a few times (with different characters and numbers, but always "hadoop-conf"). I suspect this might be something completely different, but I am mentioning it here just in case.
****FAO Talend****
I have been using your tools for years and have grown to accept that there are small teething troubles between versions. That is fine. But I have had nothing but trouble with the Big Data edition in EVERY version I have used. You need to ensure that:
1) The products are tested properly
2) There are decent tutorials that work
3) There is consistency in your approach
4) Simple things work
5) Regression between versions is minimised
I always do my best to advocate for Talend. I can advocate for DI and ESB with no problems; there are niggling issues with both, but there are workarounds, and the community has the knowledge to help people who are not aware of those issues. But with Big Data I am pulling my hair out over the lack of information provided by Talend and the apparent lack of knowledge in the community. This needs resolving.