Skip to main content
Announcements
NEW: Seamless Public Data Sharing with Qlik's New Anonymous Access Capability: TELL ME MORE!
cancel
Showing results for 
Search instead for 
Did you mean: 
dclark0699
Creator
Creator

Set Analysis To Look at Data In Next Year

With this data:

PLAYERS:

LOAD * INLINE [

    playerID, HR, 2B, 3B, yearID

    bautijo02, 54, 35, 3, 2010

    bautijo02, 43, 24, 2, 2011

    beltrad01, 48, 32, 0, 2004

    beltrca01, 41, 38, 1, 2006

    berkmla01, 42, 35, 2, 2002

    berkmla01, 45, 29, 0, 2006

    bondsba01, 73, 32, 2, 2001

    bondsba01, 46, 31, 2, 2002

    bondsba01, 45, 22, 1, 2003

    bondsba01, 45, 27, 3, 2004

];

Is it possible to set up set analysis to return a list of players who hit at least 40 HR and then the following calendar year also hit at least 40 HR?

The expected results being

bautijo02

bondsba01

And then if that is possible, can you also say, within four years of hitting at least 40 HR, they hit at least 40 HR again?

The expected results  being

bautijo02

bondsba01

berkmla01

Thanks for any advice you can provide!

11 Replies
anbu1984
Master III
Master III

In Script,

PLAYERS:

LOAD * INLINE [

    playerID, HR, 2B, 3B, yearID

    bautijo02, 54, 35, 3, 2010

    bautijo02, 43, 24, 2, 2011

    beltrad01, 48, 32, 0, 2004

    beltrca01, 41, 38, 1, 2006

    berkmla01, 42, 35, 2, 2002

    berkmla01, 45, 29, 0, 2006

    bondsba01, 73, 32, 2, 2001

    bondsba01, 46, 31, 2, 2002

    bondsba01, 45, 22, 1, 2003

    bondsba01, 45, 27, 3, 2004

];

NoConcatenate

Final:

Load *,If(playerID = Previous(playerID) And yearID = Previous(yearID)+1 And HR > 40 And Previous(HR) > 40, 1) As Flag

Resident PLAYERS Order by playerID,yearID;


Drop Table PLAYERS;


Based on Flag=1, you can retrieve playerID

jagan
Luminary Alumni
Luminary Alumni

Hi,

Try like this

Temp:

LOAD * INLINE [

    playerID, HR, 2B, 3B, yearID

    bautijo02, 54, 35, 3, 2010

    bautijo02, 43, 24, 2, 2011

    beltrad01, 48, 32, 0, 2004

    beltrca01, 41, 38, 1, 2006

    berkmla01, 42, 35, 2, 2002

    berkmla01, 45, 29, 0, 2006

    bondsba01, 73, 32, 2, 2001

    bondsba01, 46, 31, 2, 2002

    bondsba01, 45, 22, 1, 2003

    bondsba01, 45, 27, 3, 2004

];

Players:

LOAD

*,

If(Peek('playerID') = playerID AND Peek('HR') >= 40 AND HR >= 40, 1, 0) AS Flag

FROM Temp

ORDER BY playerID, yearID;


Drop Table Temp;


Now if you want to see a list of players who hit at least 40 HR and then the following calendar year also hit at least 40 HR just select Flag = 1 then you will see the players.


Hope this helps you.

Regards,

Jagan.

dclark0699
Creator
Creator
Author

While, yes this would solve this particular example, I'm looking for a way to do it without flags in the load script. For my particular case, I need it to be more dynamic than a flag on the script.

For example, I might place an input box where the user can say 1 year between, 4 years between, 6 years between etc. And I like to be able to do it against different fields. It would require potentially infinite flags.

dclark0699
Creator
Creator
Author

While, yes this would solve this particular example, I'm looking for a way to do it without flags in the load script. For my particular case, I need it to be more dynamic than a flag on the script.

For example, I might place an input box where the user can say 1 year between, 4 years between, 6 years between etc. And I like to be able to do it against different fields. It would require potentially infinite flags.

Not applicable

If you want to do it in Front end.

The Create a List box, with calculated expression as : =Class(HR,5)

Also you can add this in Straight Table Expression with PlayerID and YearID as dimensions to create bucket.

Otherwise in script level use Interval Match function to create a customized bucket and map your data with that.

Hope this was helpful.

Thanks,

Singh

dclark0699
Creator
Creator
Author

This is a very interesting function that I've never used before, but I don't see how it will calculate the number of years between two 40 HR seasons. It classifies the number of HRs into buckets x wide instead of checking x years into the future

Not applicable

Can you please explain more about this scenario based on YearID. My observation is that the expected results in the second case are where we have more than 1 entry for PlayerID with atleast 40 as HR.

Can you correct me, if I have misunderstood the requirement?

dclark0699
Creator
Creator
Author

I'm looking at how long the gap was between yearID. In the second example, I was testing who had hit 40+ HR within 4 years of hitting 40+ HR

bautijo02 2011-2010 = 1

bondsba01 2004-2003 = 1, 2003-2002 = 1, 2002-2001 = 1

berkmla01 2006-2002 = 4

So If I changed an input field from 4 to 3 it would return only

bautijo02 2011-2010 = 1

bondsba01 2004-2003 = 1, 2003-2002 = 1, 2002-2001 = 1

Because it took berkmla01 4 years to hit 40+ HR again

Does that make sense?

Not applicable

It does make sense now,

Try below steps:

Script Change:

table1:

LOAD

playerID,

max(yearID)-min(yearID) as YearFlag

Resident PLAYERS

group by playerID;

FrontEnd:

Create a Straight Table

Dimensions: playerID,yearID and HR

Expression: only({<YearFlag= {"<=$(=vYearRange)"}>}playerID) and Hide it this expression in Presentation Tab.

Filters:

vYearRange is a input variable, where you will input a number.

=Class(HR,5) as a list box to filter the HR thing

This is what I could make up, hope this will be helpful.

Thanks,

Singh