Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
gdham123
Contributor III
Contributor III

Most Frequent Sequence

Hi guys, I’m trying to get the most popular order for viewing of movies in my library. Visitors get a Viewing ID and are allowed to choose movies to watch for the day. They can watch one movie, two movies, or as many movies as they choose. It’s usually between 1 and 12, as some movies are shorter than others.

I am trying to get the most common order in which the movies are being watched, so the most frequent movie sequence. I’ve tried various types of equations (please see below), but all they succeed in doing is giving me the movie that has appeared at position 1, position 2, position 3, etc. the most often. But what I am actually trying to get is the most frequent set of movie viewed, and in that exact order. e.g. movies 6,13,9,16 (in that order) are viewed more often than any other sequence of movies.

I tried creating a variable vOrderNo1:

=Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,1))"}>}MOVIE_TITLE)

 

Then I’ve tried (as variable 2):

=If(Only({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,2))"}>}MOVIE_TITLE)<>vOrderNo1,Only({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,2))"}>}MOVIE_TITLE),
If(Only({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,3))"}>}MOVIE_TITLE)<>vOrderNo1,Only({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,3))"}>}MOVIE_TITLE),
If(Only({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,4))"}>}MOVIE_TITLE)<>vOrderNo1,Only({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,4))"}>}MOVIE_TITLE),
If(Only({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,5))"}>}MOVIE_TITLE)<>vOrderNo1,Only({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,5))"}>}MOVIE_TITLE),
If(Only({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,6))"}>}MOVIE_TITLE)<>vOrderNo1,Only({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,6))"}>}MOVIE_TITLE))))))

 

I’ve also tried (as variable 3):

=If(Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,3))"}>}MOVIE_TITLE)=vOrderNo2 or Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,3))"}>}MOVIE_TITLE)=vOrderNo1,
If(Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,4))"}>}MOVIE_TITLE)=vOrderNo2 or Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,4))"}>}MOVIE_TITLE)=vOrderNo1,
If(Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,5))"}>}MOVIE_TITLE)=vOrderNo2 or Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,5))"}>}MOVIE_TITLE)=vOrderNo1,
If(Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,6))"}>}MOVIE_TITLE)=vOrderNo2 or Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,6))"}>}MOVIE_TITLE)=vOrderNo1,
Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,7))"}>}MOVIE_TITLE),Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,6))"}>}MOVIE_TITLE)),
Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,5))"}>}MOVIE_TITLE)),Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,4))"}>}MOVIE_TITLE)),
Mode({<ACTUAL_ORDER={"$(=Min(ACTUAL_ORDER,3))"}>}MOVIE_TITLE))

 

None of the above give the desired result. It is extremely challenging.

I’ve attached a small file (just over 1,000 rows) of the type of data I am working with. I have also put a few lines below for ease of reference. Of course none of the sequences below repeat, but in the full file many do.

Thank you for your help.

Date

Viewing ID

Actual_Order

Movie_Title

06-Feb

41388988

1

Forrest Gump (1994)

06-Feb

41388988

2

The Prestige (2006)

06-Feb

41388988

3

 DoubleIndemnity (1944)

08-Feb

41388183

1

The Sting (1973)

08-Feb

41388183

2

Full Metal Jacket (1987)

11-Feb

40285472

1

 ApocalypseNow (1979)

11-Feb

40285472

2

 Rear Window (1954)

11-Feb

40285472

3

City of God (2002, Brazil/Fr)

11-Feb

40285472

4

Witness For the Prosecution (1957)

Labels (1)
1 Solution

Accepted Solutions
jaibau1993
Partner - Creator III
Partner - Creator III

Hi! I'm not sure if this is what you need but I took your data and created a table of sequences, it is, for each ID I concatenated the movies using Actual_Order as sort criteria. The next step is to check which sequence has the most frequence. I attach an example, hope it helps! Jaime.

View solution in original post

2 Replies
jaibau1993
Partner - Creator III
Partner - Creator III

Hi! I'm not sure if this is what you need but I took your data and created a table of sequences, it is, for each ID I concatenated the movies using Actual_Order as sort criteria. The next step is to check which sequence has the most frequence. I attach an example, hope it helps! Jaime.

gdham123
Contributor III
Contributor III
Author

Thanks Jaime, this is an inventive solution that is simple yet eloquent. Thanks for taking the time to figure it out. It works!