Hello,
I have a CSV file with raw data and I'm trying to load it into a Hive table that uses the Parquet format. I found a way to do this, but I was wondering whether there is an easier way that only requires a single job.
Here's how I did it:
- a Big Data Batch job which reads the CSV file from HDFS (tFileInputDelimited) and outputs it as a Parquet file (tFileOutputParquet)
- a Standard job with just the tHiveLoad component which reads the Parquet file and loads it into the Hive table
My question is: is there a way to do this in a single job?
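
For reference, here is a rough sketch of the kind of single-pass load being asked about, done with plain HiveQL over JDBC rather than with the components above: the raw CSV directory is exposed as an external staging table and then inserted into the Parquet-backed table in one statement. All table names, columns, the HiveServer2 host and the HDFS path below are hypothetical placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Sketch: CSV already sitting in HDFS -> Parquet-backed Hive table in one pass, via HiveQL.
// Every name used here (host, tables, columns, path) is a placeholder.
public class CsvToParquetHiveLoad {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://hiveserver-host:10000/default"); // hypothetical HiveServer2 URL
             Statement stmt = conn.createStatement()) {

            // Expose the raw CSV directory as an external staging table.
            stmt.execute(
                "CREATE EXTERNAL TABLE IF NOT EXISTS csv_staging "
              + "(id INT, name STRING, amount DOUBLE) "
              + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
              + "STORED AS TEXTFILE "
              + "LOCATION '/user/talend/input/csv'"); // hypothetical HDFS directory

            // Target table stored as Parquet.
            stmt.execute(
                "CREATE TABLE IF NOT EXISTS customers_parquet "
              + "(id INT, name STRING, amount DOUBLE) "
              + "STORED AS PARQUET");

            // A single INSERT ... SELECT converts and loads in one step.
            stmt.execute(
                "INSERT INTO TABLE customers_parquet "
              + "SELECT id, name, amount FROM csv_staging");
        }
    }
}

The same three statements could be issued from whatever component or client runs HiveQL in your setup; the point is only that the conversion and the load can happen in one pass on the Hive side.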
Many thanks,
Axel
I tested it, but I get the following error: "PartialGroupNameException Does not support partial group name resolution on Windows. Incorrect command line arguments."
Any clue what this means?
Hi,
If your Hive setup uses Kerberos authentication ... you must ensure it is correctly configured in Talend.
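
For reference, a minimal sketch of what "correctly configured" usually means at the JDBC level for a Kerberized HiveServer2: the connection URL must carry the HiveServer2 service principal, and a valid Kerberos ticket (for example obtained via kinit or a keytab) must be available to the job. The host, port and realm below are hypothetical; in Talend the corresponding values would go into the Kerberos options of the Hive connection or component.

import java.sql.Connection;
import java.sql.DriverManager;

// Sketch of a Kerberos-secured HiveServer2 connection; host, port and realm are placeholders.
public class KerberizedHiveConnection {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:hive2://hiveserver-host:10000/default;"
                   + "principal=hive/_HOST@EXAMPLE.COM"; // HiveServer2 service principal
        // Requires a valid Kerberos ticket in the environment (kinit or keytab login).
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Connected: " + !conn.isClosed());
        }
    }
}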
Best Regards.