I'm trying to perform a function on a column, but only where certain conditions are met.
The example is a healthcare dataset -- I have ~100,000 unique patients, and I have ~700,000 unique encounters with those patients.
I've created a variable called "VISIT_ORDER," which is simply a ranking of the patients visit sequence. I also have a variable which specifies the patient's dialysis status on the encounter level: whether the patient is not currently on dialysis, if they currently are on dialysis, or if they've never been on dialysis during the time window.
I'm trying to average a column with the first visit a patient has (my thought was minimizing VISIT_ORDER) after they've been on dialysis. In my set analysis, I want to return the result of the row where VISIT_ORDER = the minimum VISIT_ORDER where DIALYSIS_DX_GROUPING = 'AFTER DIALYSIS', then to average COUNT_ADMITS_365 for those rows.
I think what I need to do is some variant of this below, but while what I have works for one patient (when I've been troubleshooting), I'm having issues: