I have a table and I would like to include only user ids that reached Stage 3(users can go into Stage 1, then Stage 2 and so on, but it's possible a user stages stops at Stage 2. however, I would like to make sure I have only users who went to Stage 1, Stage 2 and Stage 3). One user can't get to Stage 3 without going into Stage 1 and then Stage 2.
Looks like you have spaces in the stage field, so first, you should prepare your data, for instance, remove text and spaces.
Try the following:
//Create a new field that contains only the numeric part of the stage field Data: LOAD *, KeepChar(stage, '0123456789 ') AS StageNum; Load * inline [ UserID,UserName,date,stage 1, abc, 2022-01-01, Stage1 1, abc, 2022-02-01, Stage2 1, abc, 2022-05-01, Stage3 1, abc, 2022-05-01, Stage4 2, hgb, 2022-01-01, Stage1 2, hgb, 2022-02-01, Stage2 2, hgb, 2022-05-01, Stage3 2, hgb, 2022-05-01, Stage 4 3, hgh, 2022-01-01, Stage1 3, hgh, 2022-02-01, Stage2 3, hgh, 2022-05-01, Stage3 4, gdf, 2022-05-01, Stage 1 ];
//Find max StageNum for each user //Join UserMaxStage to Data table //UserMaxStage can be used as a filter in set analysis Left Join(Data) MaxStage: LOAD UserID, Max(StageNum) AS UserMaxStage Resident Data Group By UserID;
//Drop StageNum field from the Data Model DROP FIELD StageNum;
with selected value:
Press LIKE if the given solution helps to solve the problem. If it's possible please mark correct answers as "solutions" (you can mark up to 3 "solutions").
By using exists() the load-order within the script is important because the check happens against all already loaded field-values. This means you need to control the load-order whereby they mustn't be mandatory changed because you could also use a (multiples) renaming of the field and/or using exists() with two parameters. In regard to my example from above it may look like:
temp: load distinct UserID as UserIDForExists from X where stage = 'Stage3';
final: load *, exists(UserIDForExists, UserID) as StageFlag from X;
This is actually working, can this be achieved to create another pivot table where I can include only users that have reached Stage 4? (mentioning that a user can go above Stage 4 like Stage 5, Stage 6)
If I do another join where max(stage)=stage 4 is not working