Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I was debugging a Talend Data Integration jobs for days searching why sometimes my job faces some FK violations in the database. Today I realized that some Date fields are being stored with the time part of the date as "01:00:00" instead of "00:00:00".
I produced a very small example to illustrate what I think it is a bug:
tMSSQLInput ---> tJavaRow
tMSSQLInput (connected to any database, it doesn't really matter):
select convert(DATETIME, '2015-10-18') mydate union all select convert(DATETIME, '2015-10-19')
union all
select convert(DATETIME, '2015-10-20')
tJavaRow
System.out.println(input_row.mydate);
When I run this small job, I got the following result:
[statistics] connecting to socket on port 4031 [statistics] connected Sun Oct 18 01:00:00 BRST 2015 Mon Oct 19 00:00:00 BRST 2015 Tue Oct 20 00:00:00 BRST 2015 [statistics] disconnected
Note that the 2015-10-18 is with the time part as "01:00:00". Particularly, I can only reproduce this behavior with this magic date. Any other date, except Oct 18, 2015, set the time part as "00:00:00".
Do someone know why this happens? Can someone confirm this weird behavior is really a bug, or am I missing something?
I almost forgot to post this here, but here is the bug report and the result.
I found the problem, I pointed the solution and it was promptly merged into the code (they solved in 2 days! wow!).
No questions at all, no semantics discussion. A bug is just a bug.
Hi Diego,
on Sunday 2015-10-08 at 00:00:00 daylight saving time startet in Brazil and clocks were turned forward one hour to 01:00:00.
Best regards,
Thomas
I believe this is Java *thing* related to Brazilian Summer Time. Is it causing you a big issue?
Edit: I've just noticed that @ThWabi got here first and explained it better than I did.
You are right guys. I improved my example with:
select convert(DATETIME, '2015-10-16') mydate union all select convert(DATETIME, '2015-10-17') union all select convert(DATETIME, '2015-10-18') union all select convert(DATETIME, '2015-10-19') union all select convert(DATETIME, '2015-10-20')
and the result was:
[statistics] connecting to socket on port 3708 [statistics] connected Fri Oct 16 00:00:00 BRT 2015 Sat Oct 17 00:00:00 BRT 2015 Sun Oct 18 01:00:00 BRST 2015 Mon Oct 19 00:00:00 BRST 2015 Tue Oct 20 00:00:00 BRST 2015 [statistics] disconnected
That is, Java seems to shift the date to follow my system timezone.
But despite of knowing this, I believe we are dealing with a bug anyway. Let me explain.
If we keep in mind that a Date/Calendar object in Java represents a moment in time (not a date), then "2015-10-18 00:00 BRT" and "2015-10-18 01:00 BRST" belongs to the same moment, which means their internal representation are the same. Java knows that "2015-10-18 00:00 BRT" doesn't really exists, so they prefer to display the Date/Calendar object using the second representation. So far, so good.
But, if the internal representation of them are the same, why they insert different content to the database?
My guess is that Talend component is converting the Date/Calendar object to its string representation (or any other conversion that offset the date according to the timezone), which is causing this. Since my database is not using time zones at all, I got an inconsistent behavior.
Do you agree with me? Or am I missing something again?
Is it causing you a big issue?
Yeah, it is driving me nuts actually. All Date fields of my database has FK pointing to a master Calendar table, which tells which dates are allowed in the database (this kind of consistency check is needed because of a design issue). Since there are no correspondent key in the Calendar table, I get an error when I try to insert a record with a valid date, but different time. Of course I already solved the problem before posting here, using String/VARCHAR fields for Date instead of the Date itself, but this seems to be an odd solution for me. I prefer to check what else I can do, and maybe file a bug to Talend if needed.
What you guys think about my concerns? Should I file a bug?
Is there another way to solve the problem without having to deal with Date as String?
This isn't a bug with Talend. This is a Java feature to handle the time shift. Midnight on the 18th technically never exists as the clocks are shifted forward to 1am.
https://www.timeanddate.com/time/change/brazil/sao-paulo?year=2015
It seems you did not take my concerns in consideration.
Despite being a Java thing or not, Talend is striping time zone information when it shouldn't. This is leading to an error.
Talend doesn't explicit a time zone (and don't give an option to do so). Since it doesn't explicit a time zone, it is up to Java to decide what to use to represent the date, so it follows system settings.
Note that, because of this, I can't really use DateTime type to deal with dates without time zones, because the results will always be inconsistent. In a job where both sides don't deal with time zones at all, data will flow through Talend and get messed when they hit the start of Daylight Saving Time. Better yet, this may happen or not depending of the settings of the system where Talend is running! This is why I think this is a bug.
OK, I see what you are saying. I do not believe that Talend is converting to or from a String type (unless you are implicitly converting using a tMap, for example). It would seem silly to do this when all Talend does is generate Java code. However, it doesn't make it easy for you to play around with timezones, I agree. What you can do to mitigate this is to use some Java with a tJavaRow to manipulate your Dates with specific timezone requirements.
While this is not ideal for you requirements, I still don't believe this is a bug. But maybe it is a good a feature request.
Sorry, but I am still not convinced.
In my job, I have a simple tMSSQLInput connected to tMSSQLOutput. No tMap, no code, nothing special.
I am describing a situation where the data read from a DATETIME column differs from the written to another DATETIME column. And we detected that this happens because of system settings, and there is no way to avoid this without writing custom code. Do you really believe this is OK and it is not a bug?
Also, note that I am not trying to find a workaround. I already solved this problem in particular.