T-Tests and Set Analysis

MattM · 2021-07-23

I have a data set of patient encounters. In each one, the patient went from a hospital into a nursing facility, and I have the number of days they stayed in the nursing facility, like this:

Patient	Hospital	Nursing Facility	Days
1	Hospital A	Facility 1	20
2	Hospital A	Facility 1	12
3	Hospital B	Facility 1	23
4	Hospital B	Facility 2	45

And it's rolled up like this:

Hospital	Nursing Facility	Average Length of Stay
Hospital A	Facility 1	35
Hospital A	Facility 2	32
Hospital B	Facility 2	22
Hospital C	Facility 3	25
Hospital C	Facility 1	45

And I've brought in some comparison data as well where both the Hospital and Nursing Facility are labelled 'Comparison':

Hospital	Nursing Facility	Average Length of Stay
Hospital A	Facility 1	35
Hospital A	Facility 2	32
Hospital B	Facility 2	22
Hospital C	Facility 3	25
Hospital C	Facility 1	45
Comparison	Comparison	24

What I want to do is add a column that uses a t-test to show whether the current row is statistically significantly different from the comparison data:

Hospital	Nursing Facility	Average Length of Stay	p value of difference from comparison
Hospital A	Facility 1	35	.002
Hospital A	Facility 2	32	.007
Hospital B	Facility 2	22	.2
Hospital C	Facility 3	25	.4
Hospital C	Facility 1	45	.000001
Comparison	Comparison	24	-

And I'm having trouble with the set analysis necessary to make this happen. TTest_sig requires a group field and a value field. Those should be NursingFacility and LengthOfStay. It also requires that the group field contain exactly two groups. What I want to do conceptually is something like this:

TTest_sig({<current row>+<comparison data>} NursingFacility,LengthOfStay)

But the set analysis syntax totally escapes me. Any help would be very much appreciated.
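
To make the intended calculation concrete, here is a minimal Python sketch (not Qlik) of the per-row comparison described above: each Hospital / Nursing Facility group is tested on its own against the 'Comparison' group, so every test sees exactly two groups. The encounter values, the function names `welch_t` and `compare_to_baseline`, and the choice of Welch's unequal-variance test are all assumptions made for illustration; TTest_sig may use different variance assumptions, and this sketch stops at the t statistic and degrees of freedom rather than a p-value.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical encounter rows: (hospital, nursing_facility, days).
# These values are made up for illustration; they are not the real data set.
encounters = [
    ("Hospital A", "Facility 1", 20),
    ("Hospital A", "Facility 1", 12),
    ("Hospital A", "Facility 1", 31),
    ("Hospital B", "Facility 2", 45),
    ("Hospital B", "Facility 2", 38),
    ("Hospital B", "Facility 2", 52),
    ("Comparison", "Comparison", 24),
    ("Comparison", "Comparison", 22),
    ("Comparison", "Comparison", 27),
    ("Comparison", "Comparison", 23),
]


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    va = variance(a) / len(a)  # squared standard error of sample a
    vb = variance(b) / len(b)  # squared standard error of sample b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def compare_to_baseline(rows, baseline="Comparison"):
    """Test each (hospital, facility) group against the baseline group."""
    groups = {}
    for hospital, facility, days in rows:
        groups.setdefault((hospital, facility), []).append(days)
    base = groups.pop((baseline, baseline))
    # One two-group test per remaining row of the rolled-up table
    return {key: welch_t(days, base) for key, days in groups.items()}


results = compare_to_baseline(encounters)
for (hospital, facility), (t, df) in results.items():
    print(f"{hospital} / {facility}: t = {t:.2f}, df = {df:.1f}")
```

The point of the sketch is the shape of the calculation: the comparison rows are reused in every test, while the other side of the test changes with each row of the table.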
