Hello,
I have a CSV file with raw data and I'm trying to load it into a Hive table that uses the Parquet format. I found a way to do this, but I was wondering whether there is an easier way that only requires a single job.
Here's how I did it:
- a Big Data Batch job which reads the CSV file from HDFS (tFileInputDelimited) and outputs it as a Parquet file (tFileOutputParquet)
- a Standard job with just the tHiveLoad component which reads the Parquet file and loads it into the Hive table
My question is: is there a way to do this in a single job?
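
For reference, here is a rough sketch of the kind of single-pass load being asked about, done with plain HiveQL over JDBC rather than with the components above: the raw CSV directory is exposed as an external staging table and then inserted into the Parquet-backed table in one statement. All table names, columns, the HiveServer2 host and the HDFS path below are hypothetical placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Sketch: CSV already sitting in HDFS -> Parquet-backed Hive table in one pass, via HiveQL.
// Every name used here (host, tables, columns, path) is a placeholder.
public class CsvToParquetHiveLoad {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://hiveserver-host:10000/default"); // hypothetical HiveServer2 URL
             Statement stmt = conn.createStatement()) {

            // Expose the raw CSV directory as an external staging table.
            stmt.execute(
                "CREATE EXTERNAL TABLE IF NOT EXISTS csv_staging "
              + "(id INT, name STRING, amount DOUBLE) "
              + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
              + "STORED AS TEXTFILE "
              + "LOCATION '/user/talend/input/csv'"); // hypothetical HDFS directory

            // Target table stored as Parquet.
            stmt.execute(
                "CREATE TABLE IF NOT EXISTS customers_parquet "
              + "(id INT, name STRING, amount DOUBLE) "
              + "STORED AS PARQUET");

            // A single INSERT ... SELECT converts and loads in one step.
            stmt.execute(
                "INSERT INTO TABLE customers_parquet "
              + "SELECT id, name, amount FROM csv_staging");
        }
    }
}

The same three statements could be issued from whatever component or client runs HiveQL in your setup; the point is only that the conversion and the load can happen in one pass on the Hive side.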
Many thanks,
Axel
I tested it, but I get the following error: "PartialGroupNameException Does not support partial group name resolution on Windows. Incorrect command line arguments."
Any clue what this means?
Hi,
If your Hive setup uses Kerberos authentication ... you must ensure it is correctly configured in Talend.
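
For reference, a minimal sketch of what "correctly configured" usually means at the JDBC level for a Kerberized HiveServer2: the connection URL must carry the HiveServer2 service principal, and a valid Kerberos ticket (for example obtained via kinit or a keytab) must be available to the job. The host, port and realm below are hypothetical; in Talend the corresponding values would go into the Kerberos options of the Hive connection or component.

import java.sql.Connection;
import java.sql.DriverManager;

// Sketch of a Kerberos-secured HiveServer2 connection; host, port and realm are placeholders.
public class KerberizedHiveConnection {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:hive2://hiveserver-host:10000/default;"
                   + "principal=hive/_HOST@EXAMPLE.COM"; // HiveServer2 service principal
        // Requires a valid Kerberos ticket in the environment (kinit or keytab login).
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Connected: " + !conn.isClosed());
        }
    }
}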
Best Regards.