I am new to Big Data Batch jobs, but I have previously worked with Standard Jobs to develop big data solutions. I am currently trying to build a Spark job that reads from a tHiveInput (Hive table) and writes to a tHiveOutput (Hive table), and I am getting the issues mentioned below:
[WARN ]: org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
org.apache.spark.sql.catalyst.parser.ParseException:
no viable alternative at input '<EOF>'(line 1, pos 4)
== SQL ==
USE
----^^^
at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:197)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:99)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:46)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:53)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:682)
at kai.dummy_0_1.Dummy.tHiveInput_1Process(Dummy.java:1034)
at kai.dummy_0_1.Dummy.run(Dummy.java:1432)
at kai.dummy_0_1.Dummy.runJobInTOS(Dummy.java:1304)
at kai.dummy_0_1.Dummy.main(Dummy.java:1194)
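From the trace, the statement Spark fails to parse is a bare USE with nothing after it, which looks like what gets generated when the Database field of the tHiveInput is left empty. A minimal snippet like the one below (just a sketch, assuming Spark 2.x; the class and variable names are placeholders) reproduces the same ParseException:

    import org.apache.spark.sql.SparkSession;

    public class UseStatementRepro {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("UseStatementRepro")
                    .master("local[*]")
                    .getOrCreate();

            // An empty database name turns the generated SQL into "USE ",
            // which fails with: no viable alternative at input '<EOF>'(line 1, pos 4)
            String database = "";
            spark.sql("USE " + database);
        }
    }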
I have developed a similar job with Parquet as input and written the data into another Parquet file using tSqlRow, but I am unable to replicate the same with the Hive components.
I am adding a screenshot of the job below (just the tHiveInput and tLogRow).
Thanks for the help in advance.
Aditya
Hello,
So far, the tHiveRow and tHiveCreateTable components are only available in Standard Jobs, not in Big Data Batch jobs.
If you use the Parquet components, you can already partition the data with Spark and then easily combine that with a Hive create-table step in a Standard (DI) Job, declaring it as an external table.
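As a rough sketch of that approach (all paths, table, and column names below are only placeholders): write the partitioned Parquet files from the Spark Batch Job, then declare an external Hive table over that location from a Standard Job, for example with tHiveRow running the statement shown in the comment.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;
    import org.apache.spark.sql.SparkSession;

    public class WritePartitionedParquet {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("WritePartitionedParquet")
                    .getOrCreate();

            // Source data read by the Spark Batch Job (placeholder path).
            Dataset<Row> input = spark.read().parquet("/data/source/orders");

            // Write the result partitioned by a business key (placeholder column).
            input.write()
                 .mode(SaveMode.Overwrite)
                 .partitionBy("order_date")
                 .parquet("/data/warehouse/orders_parquet");

            // In a Standard Job, a tHiveRow or tHiveCreateTable step can then
            // declare an external table over that location, for example:
            //   CREATE EXTERNAL TABLE IF NOT EXISTS orders (order_id STRING, amount DOUBLE)
            //   PARTITIONED BY (order_date STRING)
            //   STORED AS PARQUET
            //   LOCATION '/data/warehouse/orders_parquet';
            //   MSCK REPAIR TABLE orders;  -- register the existing partitions

            spark.stop();
        }
    }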
Best regards
Sabrina