Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Dear All,
i am calculating a time difference between two times. i have found in some cases time difference is about 20 ms only but in number format its contain different demension and in time format it shows 20ms only.
below is the sample data, i have also tried using interval() and other time() timestamp() etc function but getting same output
even i have tried by converting ordtimestamp by 3 digit point also.
tbl_orders:
LOAD * INLINE [
ORDKEY, ORDTIMESTAMP, TRADETIMESTAMP
2014-03-05/512361/23000030323397, 2014-03-05 15:02:03.208001, 2014-03-05 15:02:03.228
2014-03-05/512417/19000137426159, 2014-03-05 14:47:20.619389, 2014-03-05 14:47:20.639
2014-03-05/500171/15000108205982, 2014-03-05 15:01:13.441897, 2014-03-05 15:01:13.461
2014-03-05/500285/14000004008996, 2014-03-05 09:30:56.345519, 2014-03-05 09:30:56.365
2014-03-05/500285/20000002299288, 2014-03-05 12:45:17.366308, 2014-03-05 12:45:17.386
];
tbl_new:
load * , TRADETIMESTAMP-ORDTIMESTAMP as diff Resident tbl_orders;
DROP Table tbl_orders;
But when i convert both timestamp as below then also getting the same.
LOAD * INLINE [
ORDKEY, ORDTIMESTAMP, TRADETIMESTAMP
2014-03-05/512361/23000030323397, 2014-03-05 15:02:03.208, 2014-03-05 15:02:03.228
2014-03-05/512417/19000137426159, 2014-03-05 14:47:20.619, 2014-03-05 14:47:20.639
2014-03-05/500171/15000108205982, 2014-03-05 15:01:13.441, 2014-03-05 15:01:13.461
2014-03-05/500285/14000004008996, 2014-03-05 09:30:56.345, 2014-03-05 09:30:56.365
2014-03-05/500285/20000002299288, 2014-03-05 12:45:17.366, 2014-03-05 12:45:17.386
];
What should i do then, because on this result i have to apply below intervalMatch().
and after applying same i am found some of the records not falling in any criteria
TIMECAT:
LOAD * Inline [
FROM_TIME, TO_TIME, TIME_CAT
'00:00:00.000', '00:00:00.020', '10-20ms'
'00:00:00.021', '00:00:00.050', '21-50ms'
'00:00:00.051', '00:00:00.100', '51-100ms'
'00:00:00.101', '00:00:00.200', '101-200ms'
'00:00:00.201', '00:00:00.500', '201-500ms'
'00:00:00.501', '00:00:01.000', '501-1000ms'
'00:00:01.001', '00:00:02.000', '1-2sec'
'00:00:02.001', '00:00:05.000', '2-5sec'
'00:00:05.001', '00:00:10.000', '5-10sec'
'00:00:10.001', '00:00:20.000', '10-20sec'
'00:00:20.001', '00:00:50.000', '20-50sec'
'00:00:50.001', '00:01:00.000', '50-60sec'
'00:01:00.001', '16:00:00.000', '1Min+'
];
It's because of binary floating point representation. I suggest to read this blog: Rounding Errors
What you are seeing here Jitendra is the effect of date/time values being stored as floating point values. Even though both values are represented the same to 20ms behind the scenes they are slightly different, at an even lower level of granularity.
If you add a round statement into the load this should make things better, for instance:
date(round(timestamp#(TRADETIMESTAMP,'YYYY-MM-DD hh:mm:ss.ffffff'), 0.00001), 'YYYY-MM-DD hh:mm:ss.ffffff')
Or you could just take a more limited number of characters from the string:
date(timestamp#(left('TRADETIMESTAMP', 19),'YYYY-MM-DD hh:mm:ss'), 'YYYY-MM-DD hh:mm:ss.ffffff')
Both approaches reduce the granularity so that floating point errors are no longer an error.
Hope that helps,
Hi Steve,
i have check this it is useful in some cases but not fulfilling in other cases
Hi,
maybe this could be a solution:
tbl_orders:
LOAD ORDKEY,
Timestamp(Round(ORDTIMESTAMP,1/86400000),'YYYY-MM-DD hh:mm:ss.fff') as ORDTIMESTAMP,
Timestamp(Round(TRADETIMESTAMP,1/86400000),'YYYY-MM-DD hh:mm:ss.fff') as TRADETIMESTAMP,
Interval(Round(TRADETIMESTAMP-ORDTIMESTAMP,1/86400000),'hh:mm:ss.fff') as diff
INLINE [
ORDKEY, ORDTIMESTAMP, TRADETIMESTAMP
2014-03-05/512361/23000030323397, 2014-03-05 15:02:03.208001, 2014-03-05 15:02:03.228
2014-03-05/512417/19000137426159, 2014-03-05 14:47:20.619389, 2014-03-05 14:47:20.639
2014-03-05/500171/15000108205982, 2014-03-05 15:01:13.441897, 2014-03-05 15:01:13.461
2014-03-05/500285/14000004008996, 2014-03-05 09:30:56.345519, 2014-03-05 09:30:56.365
2014-03-05/500285/20000002299288, 2014-03-05 12:45:17.366308, 2014-03-05 12:45:17.386
];
TIMECAT:
LOAD Interval(Round(FROM_TIME,1/86400000),'hh:mm:ss.fff') as FROM_TIME,
Interval(Round(TO_TIME,1/86400000),'hh:mm:ss.fff') as TO_TIME,
TIME_CAT
Inline [
FROM_TIME, TO_TIME, TIME_CAT
'00:00:00.000', '00:00:00.020', '10-20ms'
'00:00:00.021', '00:00:00.050', '21-50ms'
'00:00:00.051', '00:00:00.100', '51-100ms'
'00:00:00.101', '00:00:00.200', '101-200ms'
'00:00:00.201', '00:00:00.500', '201-500ms'
'00:00:00.501', '00:00:01.000', '501-1000ms'
'00:00:01.001', '00:00:02.000', '1-2sec'
'00:00:02.001', '00:00:05.000', '2-5sec'
'00:00:05.001', '00:00:10.000', '5-10sec'
'00:00:10.001', '00:00:20.000', '10-20sec'
'00:00:20.001', '00:00:50.000', '20-50sec'
'00:00:50.001', '00:01:00.000', '50-60sec'
'00:01:00.001', '16:00:00.000', '1Min+'
];
IntervalMatch (diff)
LOAD FROM_TIME, TO_TIME
Resident TIMECAT;
hope this helps
regards
Marco
Thanks,
i will check and let you know
Dear Marco Wedel,
Thanks a lot. its working as per our need. Thanks a lot.
Regards,
Jitendra