I'm having serious issues with writing a date in the correct format to a parquet file. Here's the situation:
I read an XML file containing a date like this: "2019-09-24". In tMap I set the type to DATE with the pattern "yyyy-MM-dd", and the value shows up in the correct format. I then write it unchanged to a Parquet file. The OutputParquet component offers the same data types as Talend, so I'm assuming the value is saved in the same format.
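To make the intended value concrete, here is a minimal sketch in plain Python rather than Talend. It parses the same string with the equivalent of the "yyyy-MM-dd" pattern and shows the two representations that matter later: the day count that Parquet's DATE logical type stores (a 32-bit integer counting days since 1970-01-01) and the midnight timestamp a TIMESTAMP column would need. Whether the OutputParquet component actually writes dates with the DATE logical type is an assumption here, not something confirmed in this thread, and the variable names are purely illustrative.

```python
from datetime import date, datetime, timezone

# The value as it appears in the XML source.
raw = "2019-09-24"

# "%Y-%m-%d" is the Python equivalent of the Java/Talend pattern "yyyy-MM-dd".
parsed = datetime.strptime(raw, "%Y-%m-%d").date()

# Parquet's DATE logical type stores the number of days since the Unix epoch.
# Assumption: the writer uses this logical type for DATE columns.
EPOCH = date(1970, 1, 1)
days_since_epoch = (parsed - EPOCH).days

# A TIMESTAMP column needs a full instant; midnight UTC is the natural choice
# for a value that only carries a calendar date.
midnight_utc = datetime(parsed.year, parsed.month, parsed.day, tzinfo=timezone.utc)
epoch_seconds = int(midnight_utc.timestamp())

print(parsed.isoformat(), days_since_epoch, epoch_seconds)
```

The point of the sketch is that a date and a timestamp are different physical values in the file, so the type declared in the table DDL has to match what the writer actually stored, not just how the value is displayed.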
The problem appears when I try to read this file as an external table from Impala. In the DDL I have to declare the column as TIMESTAMP, because Impala doesn't support DATE as a data type. I've tried every format I can think of, on both source and target, but I can't get it to work. I've even tried setting all fields to STRING, and Impala still won't read the file.
The error message is something like this:
"0.parquet' has an incompatible Parquet schema for column 'stage.cdc_current_pure_application.submission_date'. Column type: TIMESTAMP, Parquet schema: optional float budget_diff [i:16 d:1 r:0]"
Can somebody please tell me how to correctly store a DATE in a Parquet file and how to correctly read it as an external table from Impala?
Thanks
Hi,
I can see that this post is a duplicate of the post below.
Could you please close this post, as we are already pursuing an answer in the other one?
Warm Regards,
Nikhil Thampi