2 Replies Latest reply: Aug 7, 2014 1:32 PM by Steve Lord RSS

    Best key/join approach for this scenario?

    Steve Lord

      Hi, What is the best approach for below, just 'outer join load' maybe, or some other approach?  I have tables like this:



      Load EventId,



      From EventInfo.qvd



      Load UserId,




      From EventSignups.qvd



      Load UserId,






      From EventAttendance.qvd


      Now, the first two tables look like they should take care of themselves since they have only one key between them (EventId).  Trick to the next two is that a person can signup for an event, but maybe change their mind for whatever reason and go as a walk-in to a different event id.  So we might see they signed up for event 123, but have no attendance information there, while they have attendance info for event 456 and no signup data for that.  Signup but no attend would be counted as 'no shows' and attend without signup would be counted as 'walk ins', but I think I can take care of those with a couple of if this and that statements on a resident load after the second and third table are joined...Or something.


      Any clean unambiguous approach for this data?

        • Re: Best key/join approach for this scenario?
          Evan Kurowski

          EventSignups and EventAttendance seem to be two facets of the same set of fact data, which when you break it down are both "Attendee actions".


          Concatenate all rows from Signups and Attendance into a single table called [Event_Facts] and mark every row with either a value of 'Signup' or 'Attend' in a new field called [AttendeeAction].  Then drop the tables [EventSignups] and [EventAttendance].


          Then going forward, there is a clean association between EventInfo and Event_Facts.  Picking a single EventID will associate with all the UserIDs of any type of action recorded for that event, and adding additional [AttendeeAction] values like 'Payment' or 'Inquired' or 'Cancelled' can also slot into this system without complicating the data model with new tables.  Going forward you can add any number of additional facts to record about each EventID in a system like this.


          If you want to isolate the attendance information from the signup information, use the set-analysis segments [AttendeeAction]={'Signup'} vs. [AttendeeAction]={'Attend'} to separate any one set of fact information from the rest when crafting expressions.


            • Re: Best key/join approach for this scenario?
              Steve Lord

              This is a well thought out response and I agree with the distinction between signup and attendance.  The new field to globally clarify that distinction between records, and merging the data to one table with that field in mind, is a nice approach.  **Specifically, your approach neutralizes any concerns about synthetic keys or loops or other wonky stuff that might form if I combine this data by some other approach.***  I will leave it to myself (or subsequent qlikcommunity posts!) to hammer out the specific equations for subsequent analysis if I get stuck later.  (There's a shopping list of requested information I need to develop equations to produce the answers to.)


              Two branches on the same tree, some analysis to see which leaves the two branches do or don't have in common...