Hi,
I have a tSchemaComplianceCheck component in my job to validate an incoming source file. As part of the process, rejected data is captured and sent back to the business user to fix, and clean data is loaded into the target table.
My issue is with date formats: this component doesn't detect errors in dates when the year portion of the date has 5 digits.
For instance, below is my source data:
SKU|TYPE|SALE_FROM|SALE_TO
209312|text|20181-07-26|20221-08-30
211448|text|20182-11-01|20231-12-31
211896|text|20183-10-26|20241-09-24
Here is the snippet of the code:
I even checked the generated code to make sure it's there as well:
However, this data is being loaded into the target table rather than being rejected:
This doesn't make any sense to me. Can someone please guide me? Is there any limitation in the isDate function?
Hi
Which version of Studio are you using? I just ran a simple test and I can't reproduce the issue.
Note that the data type of the source data is String.
Regards
Shong
Try 20221-04-12. I am 100% sure it will be let through.
20221-4-12 was rejected because tSchemaComplianceCheck doesn't accept a single-digit month (it expects 04, not 4).
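I read through the Java code for the isDate routine, and it basically uses SimpleDateFormat behind the scenes. From the documentation, SimpleDateFormat does not treat a 5-digit year as an error: with a pattern that has more than two year letters, such as yyyy, the year is interpreted literally regardless of the number of digits, so 20221 is simply parsed as the year 20221. (The adjustment to within a century applies only to two-digit year patterns.)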
https://docs.oracle.com/javase/7/docs/api/java/text/DateFormat.html#parse%28java.lang.String%29
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html
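To illustrate the behaviour described above, here is a minimal sketch of a SimpleDateFormat-based check. It is not Talend's actual routine: the class name, the helper name, the yyyy-MM-dd pattern, the fixed UTC time zone and the round-trip comparison are all assumptions. The round trip was chosen only because it reproduces the rejection of 20221-4-12 reported in this thread.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class FiveDigitYearDemo {

    // Hypothetical helper (not Talend's isDate): parses the value with
    // SimpleDateFormat in non-lenient mode, then formats the result back
    // and compares it with the input.
    public static boolean looksLikeDate(String value, String pattern) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.ROOT);
        sdf.setLenient(false);
        // Fixed zone (an assumption) so the result does not depend on the machine.
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            String roundTrip = sdf.format(sdf.parse(value));
            return roundTrip.equals(value);
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MM-dd"; // assumed pattern
        String[] samples = {"2018-07-26", "20181-07-26", "20221-04-12", "20221-4-12"};
        for (String s : samples) {
            System.out.println(s + " -> " + looksLikeDate(s, pattern));
        }
    }
}
```

In this sketch the 5-digit-year values are accepted even with setLenient(false), because the year is taken literally, while 20221-4-12 fails only because it does not format back to the same string.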
But what surprises me is that this component's sole purpose is to check the integrity of data. I wish Talend would use something robust rather than SimpleDateFormat to do this check. Again, this is based on what I read.
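As a possible stricter alternative, the sketch below uses java.time (Java 8+) instead of SimpleDateFormat. This is only an illustration, not something Talend ships: the class name, method name and the uuuu-MM-dd pattern are assumptions. With this pattern, a year with more than four digits is accepted only if it carries an explicit sign, and ResolverStyle.STRICT also rejects impossible dates such as 2018-02-30.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

public class StrictDateCheck {

    // "uuuu" is used instead of "yyyy" because strict resolution of
    // year-of-era would also require an era field.
    private static final DateTimeFormatter STRICT_FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd")
                    .withResolverStyle(ResolverStyle.STRICT);

    // Hypothetical helper: true only for a valid date written exactly as
    // a four-digit year, two-digit month and two-digit day.
    public static boolean isStrictDate(String value) {
        if (value == null) {
            return false;
        }
        try {
            LocalDate.parse(value, STRICT_FORMAT);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"2018-07-26", "20181-07-26", "2018-7-26", "2018-02-30"};
        for (String s : samples) {
            System.out.println(s + " -> " + isStrictDate(s));
        }
    }
}
```

A check like this could live in a user routine and be called on the string column before the date conversion; whether that fits a given job design is a separate question.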
I am using Talend version 7.0.1.
@Sneha Yalamarty, I tried again and confirmed the issue. Can you please report a Jira issue under the Talend DI component project on our bug tracker?
Regards
Shong