2 Replies Latest reply: Feb 10, 2016 2:45 PM by Oliver Krause RSS

    Ignore incorrect AND Repair missing data on loading

    Oliver Krause

      Hi all,

      I readin a database table similar to this

      LogID

      LoggerID

      Timestamp

      ActiveState

      01

      A

      01.01.2016 15:11:11

      Active

      01

      A

      01.01.2016 15:33:11

      Inactive

      01

      A

      01.01.2016 15:47:11

      Active

      01

      A

      01.01.2016 18:33:11

      Inactive

      02

      A

      02.01.2016 07:33:11

      Active

      02

      A

      02.01.2016 15:13:11

      Inactive

      03

      A

      03.01.2016 08:33:11

      Active

      The LogID indicates a unique ID of the recording id.

       

      In a different table I can lookup the start and end of this recording:

      LoggerID

      LogID

      LogEnd

      LogStart

      A

      01

      01.01.2016 19:31:11

      01.01.2016 15:31:11

      A

      02

      02.01.2016 17:31:11

      02.01.2016 06:31:11

       

      A cycle is defined as Active-message to Inactive-message, if the time system down-time before (inactive-active) is >1min and the active-time (active-inactive) is >2min.

       

      I then need to

      • count the number of Cycles (Active-Inactive) per loggerID per day
      • Sum up the total time of these cycles (between active-inactive) per loggerID per day

       

      There are 2 exceptions I need to handle as data quality is not always good:

      • Sometimes I have no message "Inactive" at the end of the recording, so it might look like this:

      LogID

      LoggerID

      Timestamp

      ActiveState

      01

      A

      01.01.2016 15:11:11

      Active

      01

      A

      01.01.2016 15:33:11

      Inactive

      01

      A

      01.01.2016 15:47:11

      Active

      02

      A

      02.01.2016 07:33:11

      Active

      Recording 01 has no "inactive" state at the end. I would like to assume that in this case the end-of-recording 01 from second table is the InActive-time (01.01.2016 19:31:11).

       

      Sometimes also the recording device creating this DB records looses sync and creates 100 additional false Active-Inactive-records in one second…

      LogID

      LoggerID

      Timestamp

      ActiveState

      01

      A

      01.01.2016 15:11:11

      Active

      01

      A

      01.01.2016 15:33:11

      Inactive

      01

      A

      01.01.2016 15:47:11

      Active

      01

      A

      01.01.2016 18:33:11

      Inactive

      02

      A

      02.01.2016 07:33:11

      Active

      02

      A

      02.01.2016 15:13:11

      Inactive

      03

      A

      03.01.2016 08:20:11

      Active

      03

      A

      03.01.2016 08:33:11

      Inactive

      03

      A

      03.01.2016 08:33:13

      Active

      03

      A

      03.01.2016 08:33:13

      Inactive

      03

      A

      03.01.2016 08:33:16

      Active

      03

      A

      03.01.2016 08:33:16

      Inactive

      03

      A

      03.01.2016 08:33:17

      Active

      03

      A

      03.01.2016 08:33:19

      Inactive

      03

      A

      03.01.2016 08:34:11

      Active

      03

      A

      03.01.2016 08:34:21

      Inactive

      03

      A

      03.01.2016 08:36:11

      Active

      03

      A

      03.01.2016 08:38:21

      Inactive

       

      Means, the green ones need to be counted and the red ones ignored…

       

      I would need to ignore all cycles which do not meet the cycle criteria (A cycle is defined as Active-message to Inactive-message, if the time system down-time before (inactive-active) is >1min and the active-time (active-inactive) is >2min.).

       

      Below you find a qvs with example data...

       


      LogFiles:
      LOAD * INLINE [
      LoggerID, LogID, LogStart, LogEnd
      A,01, 01.01.2016 15:31:11,01.01.2016 19:31:11,
      A,02, 02.01.2016 06:31:11,02.01.2016 17:31:11,
      ]
      ;

      Messages:
      LOAD * INLINE [
      Timestamp, ActiveState, LoggerID, LogID
      01.01.2016 15:11:11, Active, A, 01,
      01.01.2016 15:33:11, Inactive, A, 01,
      01.01.2016 15:47:11, Active, A, 01,
      01.01.2016 18:33:11, Inactive, A, 01,
      02.01.2016 07:33:11, Active, A, 02,
      02.01.2016 15:13:11, Inactive, A, 02,
      03.01.2016 08:20:11, Active, A, 03,
      03.01.2016 08:33:11, Inactive, A, 03,
      03.01.2016 08:33:11, Active, A, 03,
      03.01.2016 08:33:11, Inactive, A, 03,
      03.01.2016 08:33:12, Active, A, 03,
      03.01.2016 08:33:13, Inactive, A, 03,
      03.01.2016 08:33:13, Active, A, 03,
      03.01.2016 08:33:13, Inactive, A, 03,
      03.01.2016 08:33:13, Active, A, 03,
      03.01.2016 08:33:13, Inactive, A, 03,
      03.01.2016 08:33:14, Active, A, 03,
      03.01.2016 08:33:16, Inactive, A, 03,
      03.01.2016 08:33:16, Active, A, 03,
      03.01.2016 08:33:16, Inactive, A, 03,
      03.01.2016 08:33:17, Active, A, 03,
      03.01.2016 08:33:19, Inactive, A, 03,
      03.01.2016 08:34:11, Active, A, 03,
      03.01.2016 08:34:21, Inactive, A, 03,
      03.01.2016 08:36:11, Active, A, 03,
      03.01.2016 08:38:21, Inactive, A, 03,
      ]
      ;

       

      Finally I "only" need to

      • count the number of Cycles (Active-Inactive) per loggerID per day
      • Sum up the total time of these cycles (between active-inactive) per loggerID per day

      with the good data - "repairing" the bad as described above...

       

      Thanks for your help

       

      Oliver