Skip to main content
Announcements
Qlik Connect 2025: 3 days of full immersion in data, analytics, and AI. May 13-15 | Orlando, FL: Learn More
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Sequence analysis (aggr vs Group By?)

Hello Everyone,

I would like to analyze data but I'm not really sure how to start this analysis.

I'd like to count the number of "Start" in column "Type" only if the session x contains at least 1 type "Start", 1 type "Action" and 1 type "End".

Session_IDType
1

Start

1End
2Start
2Action
2Action
2End
3Start
3Action
3End

Typically, in the example hereabove, i would get 2 as session 1 is not considered as valid.

Here is where I can't get my mind to it. I think the usage of group by doing the import is not good as I would also need the sessions not to be grouped (so I would have to make several imports). Now, if I'd like to use aggr, I would need to merge all "Types" rows for a session and check for the string "Start, Action* End" (but I don't really know how to do.

What do you guys think would be the best option? Also, how can I merge the "type" rows to do what I want?

Thank you for your help.

Jonathan

1 Solution

Accepted Solutions
sunny_talwar

You can also try this

=Count(DISTINCT {<Session_ID = P({<Type = {'End'}>})*P({<Type = {'Start'}>})*P({<Type = {'Action'}>})>} Session_ID)

Capture.PNG

View solution in original post

7 Replies
swuehl
MVP
MVP

Maybe something like

=Sum( Aggr( If(Concat(DISTINCT Type,'|') ='Action|End|Start',1,0), Session_ID))

Though this is not checking if the Types are appearing exactely once (you can remove the DISTINCT to do this).

Anonymous
Not applicable
Author

Well, there is a long but simpler way on script side: first calculate distinct type count per session id (must be something like LOAD Session_ID, COUNT(DISTINCT Type) AS Type_Count FROM [Your Table] GROUP BY Session_ID), then you can select which one has 3 count (in this case, it is where Type_Count column equals to 3).

sunny_talwar

You can also try this

=Count(DISTINCT {<Session_ID = P({<Type = {'End'}>})*P({<Type = {'Start'}>})*P({<Type = {'Action'}>})>} Session_ID)

Capture.PNG

Anonymous
Not applicable
Author

Hello,

Thank you for your answer.

The thing is, I don't always have 3 counts and 3 counts are not necessarily good (Start, Error, End for instance).

Is it possible to add the distinction of type (text) in the load?

Thanks,

Jonathan

Anonymous
Not applicable
Author

Hello Stefan,

Is this valid if I have 1 action or more like using joker characters?

Thanks,

Jonathan

Anonymous
Not applicable
Author

Hello Sunny,

Thank you it works pretty well!

However, I'd like to understand a bit more how this works.

Does it count sessions containing a Start, Action and End type?

What does the P({}) mean?

Thank you for your answer,

Jonathan

sunny_talwar

Yes, the * sign between the three p() means the intersection. The set analysis is saying that include only those session_ids where all three types exists. Look here for p() and e(): P() &amp; E() and where do you use them?

For more advanced set analysis, check here

Set Analysis: syntaxes, examples