Skip to main content
Announcements
NEW: Seamless Public Data Sharing with Qlik's New Anonymous Access Capability: TELL ME MORE!
cancel
Showing results for 
Search instead for 
Did you mean: 
stevelord
Specialist
Specialist

Best key/join approach for this scenario?

Hi, What is the best approach for below, just 'outer join load' maybe, or some other approach?  I have tables like this:

EventInfo:

Load EventId,

EventDate,

EventLocation

From EventInfo.qvd

EventSignups:

Load UserId,

EventId,

SlotId,

SlotTime

From EventSignups.qvd

EventAttendance:

Load UserId,

EventId,

SlotId,

IsEventAttended,

IsActivityAAttended,

IsActivityBAttended

From EventAttendance.qvd

Now, the first two tables look like they should take care of themselves since they have only one key between them (EventId).  Trick to the next two is that a person can signup for an event, but maybe change their mind for whatever reason and go as a walk-in to a different event id.  So we might see they signed up for event 123, but have no attendance information there, while they have attendance info for event 456 and no signup data for that.  Signup but no attend would be counted as 'no shows' and attend without signup would be counted as 'walk ins', but I think I can take care of those with a couple of if this and that statements on a resident load after the second and third table are joined...Or something.

Any clean unambiguous approach for this data?

1 Solution

Accepted Solutions
evan_kurowski
Specialist
Specialist

EventSignups and EventAttendance seem to be two facets of the same set of fact data, which when you break it down are both "Attendee actions".

Concatenate all rows from Signups and Attendance into a single table called [Event_Facts] and mark every row with either a value of 'Signup' or 'Attend' in a new field called [AttendeeAction].  Then drop the tables [EventSignups] and [EventAttendance].

Then going forward, there is a clean association between EventInfo and Event_Facts.  Picking a single EventID will associate with all the UserIDs of any type of action recorded for that event, and adding additional [AttendeeAction] values like 'Payment' or 'Inquired' or 'Cancelled' can also slot into this system without complicating the data model with new tables.  Going forward you can add any number of additional facts to record about each EventID in a system like this.

If you want to isolate the attendance information from the signup information, use the set-analysis segments [AttendeeAction]={'Signup'} vs. [AttendeeAction]={'Attend'} to separate any one set of fact information from the rest when crafting expressions.

20140805_Event_Tracking_data_model.png

View solution in original post

2 Replies
evan_kurowski
Specialist
Specialist

EventSignups and EventAttendance seem to be two facets of the same set of fact data, which when you break it down are both "Attendee actions".

Concatenate all rows from Signups and Attendance into a single table called [Event_Facts] and mark every row with either a value of 'Signup' or 'Attend' in a new field called [AttendeeAction].  Then drop the tables [EventSignups] and [EventAttendance].

Then going forward, there is a clean association between EventInfo and Event_Facts.  Picking a single EventID will associate with all the UserIDs of any type of action recorded for that event, and adding additional [AttendeeAction] values like 'Payment' or 'Inquired' or 'Cancelled' can also slot into this system without complicating the data model with new tables.  Going forward you can add any number of additional facts to record about each EventID in a system like this.

If you want to isolate the attendance information from the signup information, use the set-analysis segments [AttendeeAction]={'Signup'} vs. [AttendeeAction]={'Attend'} to separate any one set of fact information from the rest when crafting expressions.

20140805_Event_Tracking_data_model.png

stevelord
Specialist
Specialist
Author

This is a well thought out response and I agree with the distinction between signup and attendance.  The new field to globally clarify that distinction between records, and merging the data to one table with that field in mind, is a nice approach.  **Specifically, your approach neutralizes any concerns about synthetic keys or loops or other wonky stuff that might form if I combine this data by some other approach.***  I will leave it to myself (or subsequent qlikcommunity posts!) to hammer out the specific equations for subsequent analysis if I get stuck later.  (There's a shopping list of requested information I need to develop equations to produce the answers to.)

Two branches on the same tree, some analysis to see which leaves the two branches do or don't have in common...