3 Replies Latest reply: May 5, 2017 11:17 AM by Martin Hamilton RSS

    Calculate Duration between 2 times

    Martin Hamilton

      Hi - I am trying to analyse the duration between starting and ending times. This is all around measuring the duration of a web session.

       

      I have loaded data and its created the following table which I want to work with:

       

      USER_SESSIONS:

      LOAD

          USER_ID,

          SESSION_ID,

          Year(Timestamp#(DATE_CREATED,'')) as [Session Year],

          Date(Floor(DATE_CREATED), 'DD/MM/YYYY') as [Session Date],

          Time(Time(DATE_CREATED,''),'hh:mm:ss') as [Session Time],

          MonthName(Floor(DATE_CREATED)) as [Session Month],

          Dual('Q' & Ceil(Month([DATE_CREATED])/3),QuarterStart([DATE_CREATED])) as [Session Quarter],

          WeekName(Timestamp#(DATE_CREATED,'')) as [Session Week],

          KEY_DATA,

          EVENT_TYPE,

          DESCRIPTION

       

      I am looking to take the first and the last record related to each SESSION_ID to then determine the duration into a differing table e.g.

       

      ID,Date,SessionStartTime,SessionEndTime,Duration

      -2ekF7TyywRVZRgMwAz9aXk, 31/01/2017, 11:24:32,11:34:32,00:10:00


      How would I go about doing this?

       

      If the columns were in the initial file I would probably just select the MIN and MAX of the session_id but not sure how I would do it when i am deriving the columns required as part of the load?

       

      thanks

       

      Martin



        • Re: Calculate Duration between 2 times
          Andrey Khoronenko

          Hi Martin,

           

          If your table contains a field of DATE_CREATED in format [MM/DD/YYYY hh:mm:ss], why do not you want to use the timestamp? For example

           

          USER_SESSIONS:

          LOAD

              USER_ID,

              SESSION_ID,

              Year(Timestamp#(DATE_CREATED,'')) as [Session Year],

              Date(Floor(DATE_CREATED), 'DD/MM/YYYY') as [Session Date],

              Time(Time(DATE_CREATED,''),'hh:mm:ss') as [Session Time],

              MonthName(Floor(DATE_CREATED)) as [Session Month],

              Dual('Q' & Ceil(Month([DATE_CREATED])/3),QuarterStart([DATE_CREATED])) as [Session Quarter],

              WeekName(Timestamp#(DATE_CREATED,'')) as [Session Week],

              Timestamp(DATE_CREATED) as [Session Timestamp],

              KEY_DATA,

              EVENT_TYPE,

              DESCRIPTION

          .....

           

          Then the beginning of the session will be Min([Session Timestamp]), the end of the session will be Max([Session Timestamp]),

          duration - Max([Session Timestamp]) - Min([Session Timestamp]).

           

          Regards,

          Andrey

            • Re: Calculate Duration between 2 times
              Martin Hamilton

              Hi Andrey

               

              I already do have the session time in the line:

               

                Time(Time(DATE_CREATED,''),'hh:mm:ss') as [Session Time]

               

              Do you suggest running another load across the same data source and selecting MIN and MAX records associated to each unique SESSION_ID?

               

              My data looks like the following:

               

                

              SESSION_IDSession Time
              -2ekF7TyywRVZRgMwAz9aXk11:24:32
              -2ekF7TyywRVZRgMwAz9aXk11:26:52
              -2ekF7TyywRVZRgMwAz9aXk11:30:24
              -2ekF7TyywRVZRgMwAz9aXk11:31:41
              -2ekF7TyywRVZRgMwAz9aXk11:32:54
              -2ekF7TyywRVZRgMwAz9aXk11:34:00
              -2ekF7TyywRVZRgMwAz9aXk11:34:32

               

              I still want to retain all other records in the load and not drop any.

               

              So ideally i would have another table that looks like the following:

                  

              SESSION_IDSession StartSession End Duration
              -2ekF7TyywRVZRgMwAz9aXk11:24:3211:34:4200:10:00
                • Re: Calculate Duration between 2 times
                  Martin Hamilton

                  Hi Andrey

                   

                  Thanks for making me think about this.

                   

                  I have actually come up with this which i think works, i just have to calculate the duration between the 2 times and i think i will be good to go.

                   

                  MIN_MAX:

                   

                   

                  LOAD

                    SESSION_ID,

                      SESSION_ID AS ID,

                      Year(Timestamp#(DATE_CREATED,'')) as [Year],

                      Date(Floor(DATE_CREATED), 'DD/MM/YYYY') as [Date],

                      Time(Time(DATE_CREATED,''),'hh:mm:ss') as [Time],

                      MonthName(Floor(DATE_CREATED)) as [Month],

                      Dual('Q' & Ceil(Month([DATE_CREATED])/3),QuarterStart([DATE_CREATED])) as [Quarter],

                      WeekName(Timestamp#(DATE_CREATED,'')) as [Week]

                    

                  FROM [lib://DataDirectory/eir_audit_log_2017.csv]

                  (txt, codepage is 1252, embedded labels, delimiter is ',', msq);

                   

                   

                  Right Join (MIN_MAX)

                  LOAD ID,

                    Max(Time(Time(Time,''))) as EndTime,

                    Min(Time(Time(Time,''))) as StartTime

                   

                  Resident MIN_MAX

                  Group by ID;