Anonymous
Not applicable

Talend date problem with Parquet and Impala

I'm having serious issues with writing a date in the correct format to a parquet file. Here's the situation:

 

I read an XML file with a date as follows: "2019-09-24". I set the type to DATE in tMap and it has the correct format: "yyyy-MM-dd". I output this in the same way to a Parquet file. The OutputParquet component has the same data types as Talend, so I'm assuming it's saved in the same format.
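A note on the assumption above: a pattern such as "yyyy-MM-dd" only controls how a date is parsed and displayed, not how it is stored. Parquet's DATE logical type is an int32 counting days since 1970-01-01. How the Talend Parquet output component actually encodes the column is not stated in this post, so the sketch below (plain Python, with hypothetical function names) only illustrates the difference between the text form and a stored form.

```python
from datetime import date, datetime

EPOCH = date(1970, 1, 1)


def parse_iso_date(text):
    """Parse a string written with the Java-style pattern yyyy-MM-dd."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def to_parquet_date_days(d):
    """Days since the Unix epoch: the physical value behind a Parquet DATE."""
    return (d - EPOCH).days


def to_timestamp_text(d):
    """Midnight of the given day as 'yyyy-MM-dd HH:mm:ss' text."""
    return datetime(d.year, d.month, d.day).strftime("%Y-%m-%d %H:%M:%S")


# Value taken from the post; everything else here is illustrative.
submission = parse_iso_date("2019-09-24")
days = to_parquet_date_days(submission)
ts_text = to_timestamp_text(submission)
print(days, ts_text)
```

If the file ends up holding the date as text, a string in the second form is one that a TIMESTAMP cast would normally accept; whether that applies here depends on what the component writes.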

 

The problem arises when I try to read this file as an external table from Impala. When specifying the DDL I have to use TIMESTAMP, as Impala doesn't support DATE as a data type. I've tried every format imaginable, on both source and target, but I can't get it to work. I've even tried setting all fields to STRING and it still won't let me read the file.

 

The error message is something like this:

"0.parquet' has an incompatible Parquet schema for column 'stage.cdc_current_pure_application.submission_date'. Column type: TIMESTAMP, Parquet schema: optional float budget_diff [i:16 d:1 r:0]"
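Reading the error closely: the table column it names is submission_date (TIMESTAMP), but the Parquet field it was compared with is budget_diff (float). Since Impala by default matches Parquet columns by position rather than by name, this may point to a mismatch between the column order in the DDL and the field order in the file, rather than to a date format problem. A small sketch (Python; the helper name is hypothetical and the regex is fitted only to this one message) that pulls the two names out of such an error:

```python
import re

# Fitted to the message quoted in this post; other Impala versions may word it differently.
PATTERN = re.compile(
    r"for column '(?P<table_col>[^']+)'\. "
    r"Column type: (?P<table_type>\w+), "
    r"Parquet schema: (?P<repetition>\w+) (?P<parquet_type>\w+) (?P<parquet_field>\w+)"
)


def describe_mismatch(message):
    """Return the table column and Parquet field named in the error, or None."""
    match = PATTERN.search(message)
    if match is None:
        return None
    info = match.groupdict()
    # Drop the database and table prefix to compare bare column names.
    info["table_col_name"] = info["table_col"].rsplit(".", 1)[-1]
    info["names_differ"] = info["table_col_name"] != info["parquet_field"]
    return info


error = (
    "0.parquet' has an incompatible Parquet schema for column "
    "'stage.cdc_current_pure_application.submission_date'. "
    "Column type: TIMESTAMP, Parquet schema: optional float budget_diff [i:16 d:1 r:0]"
)
result = describe_mismatch(error)
print(result)
```

When the two names differ like this, comparing the DDL column list against the file's schema, in order, is a reasonable first check.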

 

Can somebody please tell me how to correctly store a DATE in a parquet file and how to correctly read it as an external table from Impala? 

 

Thanks

 

 

1 Reply
Anonymous
Not applicable
Author

Hi,

 

I can see that this post is a duplicate of the post below:

 

https://community.talend.com/t5/Design-and-Development/Parquet-file-DATE-and-Impala/m-p/158307#M9666...

 

Could you please close this post, as we are already pursuing an answer in the other one?

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂