Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Set Analysis Intersect

I'm have counts of that where I am trying to get an intersection or cohort of all persons in each year.  I would like it to be flexible enough to be able to select more then two years.  In some cases in which I have data I need a five year cohort.  I'm really not find a good resource for documentation on Set Analysis.  I have a PDF from a set analysis course, however, I'm just not getting the syntax correct or something.  This, of course, is giving me the result of multiplying the two counts together and not the intersection.  Essentially, I want only people who belong to both years and how to code it for any number of years selected. Any advice would be helpful.

= Count( { $< Year ={'2010'} >} Distinct PersonKey) * Count( { $< Year ={'2011'} >} Distinct PersonKey)

Thanks in advance!

7 Replies
swuehl
MVP
MVP

With two values, try

= Count( { $< Year ={'2010'} >*< Year ={'2011'} >} Distinct PersonKey)

But have you tried an and-mode selection list box? Check the Help for that.

Not applicable
Author

I can get a return number for each set separately, however, when combining them into the statement above it returns zero.  I cannot use and-mode because it is not available.  According to my research my Year field does not meet the requirements to use and-mode.  Any other suggestions?

swuehl
MVP
MVP

Hm, need to rethink above.

In the meantime, this should work:

=count({<PersonKey =p( {<Year={2011}>})*p({<Year= {2012}>})>}  distinct PersonKey)

And yes, you need to fulfill some requirements to get an and-mode list-box working. Maybe it's possible to extend/change your data model?

Not applicable
Author

That actually works and the numbers check out.  My next issues is how do I make that more dynamic? I want to be able to select multiple years and have it count only where the PersonKey is in every year.  I modified the code as follows which is great for just two years but I'm not sure how to do 3 or more.

count({<PersonKey =p( {<Year={$(=Min(Year))}>})*p({<Year= {$(=Max(Year))}>})>}  distinct PersonKey)

Thanks again!  I really appreciate the expertise!

swuehl
MVP
MVP

IMHO, your last expression will just return the PersonKeys that appears in the max and the min year selected.

Hm, doing this for a dynamic number of selected years could be somewhat complicated, I need to rethink again.

Maybe someone else can step in, I probably can't spend much time on this.

As an alternative, you could try playing around with an extension to your data model that creates a table that fulfills all requirements for an and-mode list box. Try using a distinct load of your PersonKeys with Year field renamed to AndYear (you could rename it again in your list box).

Not applicable
Author

I have verified that it is giving a correct result by using a query on my MS SQL Db.  I do appreciate your help.  I just need to figure out how to make it more dynamic. Thanks again.

danielrozental
Master II
Master II

following Stefan's idea you could do something like this

Create a variable, name it vSET. Variable value should be

='p({<Year={' & concat(distinct Year,'}>})*p({<Year={') & '}>})'

Then your expression should be

=count({<PersonKey=$(vSET)>}  distinct PersonKey)

This will cause the set analysis expression to change when the year selection changes

ie. 2007, 2008, 2009, 2010 selected

=count({<PersonKey=p({<Year={2007}>})*p({<Year={2008}>})*p({<Year={2009}>})*p({<Year={2010}>})>}  distinct PersonKey)

ie. 2007, 2009, 2010 selected

=count({<PersonKey=p({<Year={2007}>})*p({<Year={2009}>})*p({<Year={2010}>})>}  distinct PersonKey)