2 Replies Latest reply: Sep 15, 2017 1:06 PM by Kaushik Solanki RSS

    Convert start/stop to list of days

    Russell VanBlon

      Hi everyone,

       

      I've spent some time searching online for a solution to this but most of the examples I run into use dates/times to create ranges and it doesn't seem to fit our use case.

       

      I have data stored in a database in the format below describing when a user performs an "action".

       

      I need to re-format this data to show which users performed an action each day.  The total number of actions per user do not matter, I just need to know active/inactive (0/1) for each user.

       

      Source Data:

      User IdAction Start DayAction End Day
      'A'11
      'B'34
      'C'55
      'A'13

       

      SourceData:

      LOAD * Inline [

      User, StartDay, EndDay

        'A', 1, 1

        'B', 3, 4

        'C', 5, 5

        'A', 1, 3

      ];

       

      The data represents "users" and the days they performed an action.

       

      For example:

      User 'A' started an action on day 1, then finished on day 1.

      User 'B' started an action on day 3, then finished on day 4.

      User 'C' started an action on day 5, then finished on day 5.

      User 'A' started another action on day 1, then finished that action on day 3. (actions can span days and overlap with other actions)

       

      User 'A' performed 2 actions on day 1, but we only care if any action was performed that day or not.  We don't need the total number of actions.  The data could have "overlaps" in multiple rows, but we don't care about the total.

       

      I would like to convert this to the following format showing "buckets" representing days each user performed an action: (if you have a better suggestion to format the data, I am open to ideas)

       

      Desired output data to visualize in QlikSense:

      UserDay 1Day 2Day 3Day 4Day 5Day 6
      'A'111000
      'B'001100
      'C'000010
      Total112110

       

      The goal of the output table is to create a "Total" of all active users per day, so we can visualize that total for each day by totaling each Day.  Note that User 'A' was performed 2 actions on day 1 in the source table, but each user is only counted once in the output table.

       

      Note: The example above is simplified.  There are 100+ days in our output table, so hopefully there's a way to do this without assigning each day in the load script.

       

      For example:

      On day 1: 1 user was active

      On day 2: 1 user was active

      On day 3, 2 users were active

      etc.....

       

      I'm not married to the data formats above, although the first table represents an over-simplified version of how our data is currently stored in the database.  We can change that format if needed.

       

      Any ideas/suggestions on how to translate the first table into the 2nd table?  Thanks in advance!