: org.apache.hadoop.hive.ql.exec.Utilities - Processing alias <topic>
: org.apache.hadoop.hive.ql.exec.Utilities - Adding input file hdfs://clustername/<some_directory>/<topic>/day=2015-02-09/time=00-00
: org.apache.hadoop.hive.ql.exec.Utilities - Content Summary not cached for hdfs://clustername/<some_directory>/<topic>/day=2015-02-09/time=00-00
java.lang.IllegalArgumentException: java.net.UnknownHostException: clustername
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:231)
I'm using Talend Big Data 5.6 on Windows 7. Everything looks normal up to this point: the flow connects to HCatalog (for metadata) and HDFS (for the temp directory) without problems. It fails when Hive hands back the cluster name instead of one of the NameNode hosts, and the client then tries to resolve the cluster name as a hostname.
The connection mode was Embedded. I got the flow to work by switching to Standalone mode and Hive 2. This doesn't directly answer the HA question, but it works.
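On the HA question itself: the `createNonHAProxy` frame in the trace suggests the embedded client did not recognise `clustername` as an HA nameservice and treated it as an ordinary host. A likely cause, though nothing in the log confirms it, is that the client-side configuration is missing the standard HDFS HA properties. The sketch below only assembles those property names and values with plain Java so they can be copied into whatever configuration the client reads. The NameNode IDs (`nn1`, `nn2`), hostnames, and port are hypothetical placeholders, and the class and method names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HaClientConfSketch {

    /**
     * Builds the client-side HDFS properties for a single HA nameservice.
     * rpcAddresses maps a NameNode ID (e.g. "nn1") to its "host:port" RPC address.
     */
    public static Map<String, String> haClientProperties(String nameservice,
                                                         Map<String, String> rpcAddresses) {
        Map<String, String> props = new LinkedHashMap<>();
        // The default filesystem points at the logical nameservice, not a real host.
        props.put("fs.defaultFS", "hdfs://" + nameservice);
        props.put("dfs.nameservices", nameservice);
        // Comma-separated list of the NameNode IDs that belong to the nameservice.
        props.put("dfs.ha.namenodes." + nameservice,
                String.join(",", rpcAddresses.keySet()));
        // One RPC address entry per NameNode ID.
        for (Map.Entry<String, String> e : rpcAddresses.entrySet()) {
            props.put("dfs.namenode.rpc-address." + nameservice + "." + e.getKey(),
                    e.getValue());
        }
        // Tells the client how to pick the active NameNode for this nameservice.
        props.put("dfs.client.failover.proxy.provider." + nameservice,
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> namenodes = new LinkedHashMap<>();
        namenodes.put("nn1", "namenode1.example.com:8020"); // hypothetical host and port
        namenodes.put("nn2", "namenode2.example.com:8020"); // hypothetical host and port

        // Print key=value pairs for the nameservice that appears in the log.
        haClientProperties("clustername", namenodes)
                .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

If the connection component exposes a table of additional Hadoop properties, these key/value pairs could be entered there. Otherwise, putting the cluster's own `hdfs-site.xml` on the client classpath should have the same effect. The real NameNode IDs and addresses have to come from the cluster's configuration.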