Hi there,
I am a support engineer at Snowflake. I would like to get in touch with the Talend support team regarding a problem a customer is having when ingesting a CSV file that contains dates (in the format "YYYY-MM-DD") earlier than 1970-01-01 (the Unix epoch) into Snowflake via Talend Studio (7.1.1.20201230_0510-patch).
A few example rows from the CSV file:
1234567890|1980-02-19
1234567891|1969-11-23
While the date in the CSV file is a string, Talend Studio appears to convert the date column into Unix epoch time (in milliseconds) before staging and inserting the data into Snowflake. We were able to confirm this by inspecting the temp file generated by Studio (its path is shown in the log output when running a job in Talend Studio). The epoch conversion is the problem: Snowflake treats integer date/time values whose magnitude is less than 31536000000 (the number of milliseconds in a year) as a number of seconds, not milliseconds.
Using the second row of the example above, Talend Studio converts the date to -3369600000 (millisecond precision). The converted data is then staged into Snowflake and inserted into a target table. However, Snowflake regards any number below 31536000000 as seconds, so when the row is inserted, the date is interpreted as 1863-03-23 instead.
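The arithmetic above can be reproduced in a few lines of Python. This is a sketch of the suspected behaviour only, not Talend's or Snowflake's actual code; the threshold check mirrors the auto-detection rule described in the Snowflake documentation linked below:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# What Talend Studio appears to do: convert the date string to epoch milliseconds.
d = datetime(1969, 11, 23, tzinfo=timezone.utc)
epoch_ms = int((d - EPOCH).total_seconds() * 1000)
print(epoch_ms)  # -3369600000

# Snowflake's auto-detection rule: an integer whose magnitude is below
# 31536000000 is assumed to be seconds rather than milliseconds.
assert abs(epoch_ms) < 31_536_000_000
misread = EPOCH + timedelta(seconds=epoch_ms)
print(misread.date())  # 1863-03-23
```

Running this shows exactly the corruption we observed: -3369600000 milliseconds (39 days before the epoch) is re-read as -3369600000 seconds (39,000 days before the epoch).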
See the Snowflake documentation on auto-detection of integer-stored DATE and TIMESTAMP values for reference: https://docs.snowflake.com/en/user-guide/date-time-input-output.html#auto-detection-of-integer-stored-date-time-and-timestamp-values
Note that the problem does not happen consistently, but we have a recording of it happening.
When the same CSV is ingested directly into Snowflake (via a connector or the SnowSQL client), the issue does not arise, as no Unix epoch conversion takes place. Could someone please get in touch with me? I am unable to raise a support case without a product code.
Regards,
Kenny Lim