Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik GA: Multivariate Time Series in Qlik Predict: Get Details
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Sequence of events

Hi all,

i have a simple table, which contains feedback data of people who booked a course.

The table contains (amongst others) the following fields:

- Date (date of submitting the feedback)

- Course type (The type of the course the feedback is about)

- InterestedInOthers (Can be 1 or 0. Determins if the person is also interested in other courses)

- InterestedIn (Can be A and / or B and / or C - where A, B and C are the available course types)

Some example records could look like follows:

FeedbackID
Date          UserIDCourseTypeInterestedInOthersInterestedIn_AInterestedIn_BInterestedIn_C
1112011-06-27JohnA1011
1122011-05-15JackA0000
1132011-07-22JohnB1001
1142011-08-10AlexA1000
1152011-05-13AlexB0000

(The existance of a record for a certain course means that the person submitting the record actually took this course)

When we look at user John we see that he took a course A (in June) first, he was interested in course B and eventually he took course B (in July - AFTER taking course A)

But when we look at user Alex we see that Alex claimed to be interested in course B in August. However at this point in time he already took this course (in May...)

The analysis i try to set up is:

- How many people taking a certain course were interested in other courses. (Easy)

- Out of the number of people above: How many actually booked another course ? --> Here i only want to count those people that took the course they were interested in _AFTER_ the course where they claimed that they were interested. Means for class A the number of people interested in class B would be 2, but only 1 actually took course B (John) - Alex cannot be taken into account, since at the time of submitting the feedback he already too the class that he is interested in.

Question is: How do i do this ?

Thanks for any hints,

K

4 Replies
swuehl
MVP
MVP

kalinkula,

maybe like attached?

I first used a crosstable load to get the InterestedIn_* fields into one field, which makes it easier to calculate the sequence afterwards.

Then I created a second table to hold the courses the Users are interested in, creating additional fields for this DateInterest and CourseTypeTaken.

Courses:

CrossTable(Interested, Data, 5)

LOAD FeedbackID,

     Date,

     [UserID],

     CourseType,

     InterestedInOthers,

     InterestedIn_A,

     InterestedIn_B,

     InterestedIn_C

FROM

[http://community.qlik.com/thread/39745?tstart=0]

(html, codepage is 1252, embedded labels, table is @1);

Check:

LOAD [UserID], CourseType as CourseTypeTaken, Date as DateInterest, subfield(Interested,'_',2) as Interested_In

resident Courses where Data <> 0 order by [UserID], Date asc;

The chart expression (with dimensions CourseTypeTaken and InterestedIn) then looks like:

=aggr( count(distinct if(Date >= DateInterest and CourseType=Interested_In,UserID)),UserID,CourseTypeTaken,Interested_In)

The result seems to be correct to me.

(I can' t see the interest of Alex in course B in your sample data, probably a typo in your table.)

Regards,

Stefan

Not applicable
Author

Hi there,

thanks a lot for the hint, but i have my problems understanding what exactly you did - and why. Can you explain a bit more in detail ? (I am still new to QV. Thanks.)

Also: When i try to map your example solution to my data i end up with a table structure that contains 3 orphan tables (InterestedIn_A, InterestedIn_B and InterestedIn_C) - Why is that ?

Thanks very much for a short reply.

K

Not applicable
Author

OK, solved the issue with the orphans by adding a 'where 1=1 at the end of my LOAD of the Courses table).

Still i would like to understand, what exactly you did and why.

Thanks for some additional feedback.

K

swuehl
MVP
MVP

Ok, I try.

Fist, I use a crosstable load prefix to turn a cross table into a straight table. I am not sure if you are familiar with this load prefix, please have a look at the manual and the Help file, if not. Essentially I am converting the three fields called InterestedIn_A, _B, _C (and any more that might follow in the input table) into two, one called InterestedIn is containing the former field name, the other, Data, is containing the former cell value.

So I am reducing the number of fields by increasing the number of rows. I do this because I want / need one field that contains the information I am looking for: the Course one is interested in.

You can examine the outcome of the crosstable load by looking at the Courses table in table view or in a table box.

I then create a second table using the just created table:

Check:

LOAD [UserID], CourseType as CourseTypeTaken, Date as DateInterest, subfield(Interested,'_',2) as Interested_In

resident Courses where Data <> 0 order by [UserID], Date asc;

Here I just copy the CourseType, Date and create the Interested_In course to match the CourseType format (it is important that the CourseType can be derived from InterestedIn_Type later on).

Using the where clause Data <> 0 I limit the data in the table Check to only the records that show the UserID had an interest in any course.

The last step is performed in the UI:

=aggr( count(distinct if(Date >= DateInterest and CourseType=Interested_In,UserID)),UserID,CourseTypeTaken,Interested_In)

The aggr() function is grouping our data by UserID, CourseTypeTaken and Interested_In field (note that these come from the two different tables). We then count distinct UserID, where the Date (coming from table Courses, from the UserID records), is larger or equal the Date of interest and CourseType (coming from table Courses, from th UserID records) equals Interested_In. Here, table Courses represents the courses really taken, and table Check the courses the users have interest in. I copied the dates of the courses taken to the table with the courses of interest, to be able to iterate independently over these two dates, allowing the comparison needed.

There might be a simpler solution, and if you don't need the results to be sensitive to selections, you could probably move the check into the script.

Hopefully this clarified most things,

Stefan