Diego_Queiroz
Contributor III

Inconsistent time part in DATE field: bug or what?

I spent days debugging a Talend Data Integration job, trying to find out why it sometimes hits FK violations in the database. Today I realized that some Date fields are being stored with a time part of "01:00:00" instead of "00:00:00".

 

I produced a very small example to illustrate what I think is a bug:

tMSSQLInput ---> tJavaRow

 

tMSSQLInput (connected to any database, it doesn't really matter):

 

select convert(DATETIME, '2015-10-18') mydate
union all
select convert(DATETIME, '2015-10-19')
union all
select convert(DATETIME, '2015-10-20')

 

tJavaRow

System.out.println(input_row.mydate);

When I run this small job, I get the following result:

[statistics] connecting to socket on port 4031
[statistics] connected
Sun Oct 18 01:00:00 BRST 2015
Mon Oct 19 00:00:00 BRST 2015
Tue Oct 20 00:00:00 BRST 2015
[statistics] disconnected

Note that 2015-10-18 comes back with the time part "01:00:00". I can only reproduce this behavior with this magic date: every other date I tried gets the time part "00:00:00".

 

Does anyone know why this happens? Can someone confirm that this weird behavior is really a bug, or am I missing something?
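
For reference, the same symptom can be reproduced in plain Java without Talend or a database. This is only a sketch under one assumption: that the JVM running the job uses the America/Sao_Paulo zone (inferred from the "BRST" abbreviation in the output above, not stated in the post). In that zone, daylight saving time began at midnight on Oct 18, 2015, so 00:00 local time did not exist on that day. The class and method names are illustrative, and the sketch uses java.time rather than the java.util.Date objects that the job itself handles.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class DstGapDemo {
    // Assumption: the job's JVM default zone (inferred from "BRST" in the output).
    static final ZoneId ZONE = ZoneId.of("America/Sao_Paulo");

    // Returns the first valid local time of the given day in ZONE.
    // If midnight falls into a DST gap, java.time moves to the end of the gap.
    static ZonedDateTime startOfDay(int year, int month, int day) {
        return LocalDate.of(year, month, day).atStartOfDay(ZONE);
    }

    public static void main(String[] args) {
        // Same three dates as the SQL query in the example job.
        for (int day = 18; day <= 20; day++) {
            System.out.println(startOfDay(2015, 10, day));
        }
    }
}
```

Only Oct 18 should come back with an hour of 1. The other two days start at hour 0, which matches the pattern in the job output.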

25 Replies
Anonymous
Not applicable

You will be communicating with Salesforce using SOAP I assume. Salesforce may well store dates in UTC (for good reasons), but it only communicates the date and not the timezone. Java (not Talend) interprets the Date using a timezone. This is actually quite useful in more cases than it is not AND you can code around it if it becomes an issue.  
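
To illustrate the point that Java interprets a Date using a timezone: a java.util.Date holds a single instant, and the calendar date and time you see depend on the zone used to render it. The following is a minimal sketch, not Talend's generated code. The instant and the two zone IDs are chosen purely for illustration.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.TimeZone;

class SameInstantDemo {
    // Renders the same instant as wall-clock time in the given zone.
    static String render(Date date, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm");
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(date);
    }

    public static void main(String[] args) {
        // Midnight UTC, as a system that stores dates in UTC might send it.
        Date date = Date.from(Instant.parse("2015-10-18T00:00:00Z"));
        System.out.println(render(date, "UTC"));
        System.out.println(render(date, "America/Sao_Paulo"));
    }
}
```

The two printed lines differ by three hours, and by one calendar day, even though the Date object is the same.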

 

My point was that this is NOT a Talend *bug* since there are no code defects, and the functionality works precisely as an experienced Java developer using a Java code generation tool would expect. You can also configure the system to mitigate for any issues you find with the standard Java handling of dates. 

Diego_Queiroz
Contributor III
Author

It is a defect when it deals with other systems.

The way you talk, you expect every Talend user to be a Java programmer. Even though Talend is a code generator, there should be no real expectation that the user is aware of the generated code. The tool, not the user, should handle how one system connects to another.

If it allows me to read from a system that is not aware of timezones, and use that data to write to another system that is also not aware of timezones, it should handle that nicely, not force me to understand that the generated code will store that piece of information in a structure that will mess with the data in the process.

Anonymous
Not applicable

Raise a feature request and you may be listened to. Raise a defect and it will quite rightly be closed immediately.  #NotABug

d416
Contributor


@rhall wrote:

You will be communicating with Salesforce using SOAP I assume.

 via Talend's tSalesforce* components, but this issue happens with any Talend component.

 

Salesforce may well store dates in UTC (for good reasons), but it only communicates the date and not the timezone. Java (not Talend) interprets the Date using a timezone. This is actually quite useful in more cases than it is not AND you can code around it if it becomes an issue.  

The beauty of Talend is that a person can just drag/drop fields in a tMap with a schema that has been automatically determined by the tool - any unnecessary coding on top of that takes away from this 'just works' experience.

 

But beyond all that, no transformations should be applied to ANY data unless a job is explicitly written to transform it. It doesn't matter which underlying framework is causing it; this is just a basic expectation of behavior.

In my specific case, the problem with 'coding around it' is that there could literally be dozens of Date fields on a single object in Salesforce. Each object has 4 date fields for audit (last modified, last viewed, created, etc.). Our own org has about 200 different objects, so that's a lot of custom coding on each Date field, just to keep the date the same as it originates in the source data.
 

My point was that this is NOT a Talend *bug* since there are no code defects, and the functionality works precisely as an experienced Java developer using a Java code generation tool would expect. You can also configure the system to mitigate for any issues you find with the standard Java handling of dates. 


Semantics, but sure.. call it a 'User Experience Gap'... I will log the ticket... future users will be thankful for the labor we have put into getting to the root cause of this issue.

 

Anonymous
Not applicable

The tSalesforce components use SOAP requests.

 

Talend is a developer tool. It uses Java. You cannot be an effective Talend developer unless you know at least some Java. You can do basic stuff, but to do anything of any substance you need to understand how Java works.

 

Debugging requires a knowledge of Java. What is a NullPointerException? What causes it?
All expressions in Talend are Java. It might not look like it, but you are writing Java with every expression you write.
Talend Routines are Java classes. You can copy and paste your *Routine* code into a java file and compile it as normal Java.

If you want to make use of 3rd party Java APIs you can do this easily.....by using bog standard, out of the box Java.
When you compile and run, you are compiling and running a Jar in a Java virtual machine.

 

I'm really not sure how anyone can contest that this is a Java development tool which requires an understanding of Java to use it effectively. 

 

Regarding the extra coding required for Dates. You cannot really do without a bit of extra coding when dealing with Dates in ANY language if you are going against the trend of that language. Dates are intrinsically complicated. Java handles that by using timezones. For most scenarios, it works very well "out of the box". 

 

Regarding Date transformation happening implicitly, you are using Dates incorrectly and unwittingly causing that transformation. Use UTC across the board and you will be fine. Alternatively add some of the code recommendations I have made. In your scenario with many Salesforce dates, it would be prudent to ensure your JVM is set to UTC. Configuration of your systems is paramount and is Data Integration 101. 
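
A rough sketch of what "use UTC across the board" changes, assuming date-only values are parsed and printed with java.text.SimpleDateFormat. The helper name roundTrip is hypothetical, and the zone is passed explicitly here so that both cases can be compared in one run. In a real job the zone would more likely come from the JVM default, which can be set with the standard -Duser.timezone=UTC system property or with TimeZone.setDefault.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

class UtcDefaultDemo {
    // Parses a date-only string in the given zone and prints it back in that zone.
    static String roundTrip(String isoDate, String zoneId) {
        TimeZone zone = TimeZone.getTimeZone(zoneId);
        SimpleDateFormat in = new SimpleDateFormat("yyyy-MM-dd");
        in.setTimeZone(zone);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        out.setTimeZone(zone);
        try {
            Date parsed = in.parse(isoDate); // lenient parsing by default
            return out.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyy-MM-dd date: " + isoDate, e);
        }
    }

    public static void main(String[] args) {
        // UTC has no DST transitions, so midnight always exists.
        System.out.println(roundTrip("2015-10-18", "UTC"));
        // In this zone, midnight on this day falls into the DST gap.
        System.out.println(roundTrip("2015-10-18", "America/Sao_Paulo"));
    }
}
```

With UTC the time part stays at 00:00:00. With America/Sao_Paulo the lenient parser shifts the non-existent midnight forward, which is the same "01:00:00" seen in the original post.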

 

The difference between a bug and a feature is not semantics. You are comparing apples with oranges. What you are suggesting is that because Talend haven't coded around accepted and understood Java default functionality, that is somehow a defect. Talend have never stated that they have changed the underlying program language's functionality. They have never stated that Dates would work in any other way than the way they work. If you understand how the product works and understand Java, you can quite easily mitigate for the most complicated Date scenarios. It is therefore not a defect or a bug. However, it could be considered a reasonable feature request to make configuring Dates easier for those who cannot make use of the functionality that is available. But any feature to make it easier for your requirement, will have to be implemented very carefully or it will cause havoc for the majority of cases where the default Java functionality works perfectly well. Would people making use of the default functionality be expected to start learning where Talend goes against Java defaults and where it conforms to them? 

 

Diego_Queiroz
Contributor III
Author

I almost forgot to post this here, but here is the bug report and the result.

 

 

I found the problem, pointed out the solution, and it was promptly merged into the code (they solved it in 2 days! wow!).

No questions at all, no semantics discussion. A bug is just a bug.