Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all,
I am trying to figure out how to make a count of and then find the max of only groups of sequential numbers. Ideally for the load script but in a function is fine too.
The data looks like this...
PatientID , TreatmentDay
1 , 1
1 , 2
1 ,3
1 ,4
1 ,6
1 ,7
1 ,8
2 ,1
2 ,2
2 ,3
2 ,4
2 ,5
The correct value for patient1 would be 4 and for patient 2 5
I attach a xls to this post with a basic data set in,
Any help appreciated.
Ok, I hope thats final.
input:
LOAD TESTSYSTEMID,
DOSINGDAY
FROM
[day seq.xlsx]
(ooxml, embedded labels, table is Sheet1);
seqfalgs:
LOAD
TESTSYSTEMID,
DOSINGDAY,
if(TESTSYSTEMID=previous(TESTSYSTEMID), if(DOSINGDAY-previous(DOSINGDAY)=1, 1, 0), 0) as SEQUENTAL
Resident input
order by TESTSYSTEMID, DOSINGDAY;
drop table;
seqstring:
LOAD
TESTSYSTEMID,
Concat(SEQUENTAL, '', DOSINGDAY) as SEQSTRING
Resident seqfalgs
group by TESTSYSTEMID;
sunseqs:
LOAD
TESTSYSTEMID,
len(SubField(SEQSTRING, '0'))+1 as SEQLEN
Resident seqstring;
LOAD
TESTSYSTEMID,
Max(SEQLEN) as MAXSEQLEN
Resident sunseqs
group by TESTSYSTEMID;
input:
LOAD TESTSYSTEMID,
DOSINGDAY
FROM
[day seq.xlsx]
(ooxml, embedded labels, table is Sheet1);
seqfalgs:
LOAD
TESTSYSTEMID,
DOSINGDAY,
if((DOSINGDAY=previous(DOSINGDAY)+1 and TESTSYSTEMID=previous(TESTSYSTEMID) and peek(SEQUENTAL)<>0) or
(TESTSYSTEMID<>previous(TESTSYSTEMID)), 1, 0) as SEQUENTAL
Resident input;
drop table;
seqcount:
LOAD
TESTSYSTEMID,
SUM(SEQUENTAL) as SEQCOUNT
Resident seqfalgs
group by TESTSYSTEMID;
Whiteline,
Thanks so much. But it doesnt seem to work correctly. There are only SEQENTAL counts of 0 and 1. I'll dive in to your script and have a think.
Cheers
Mark
Look carefuly.
The result is in SEQCOUNT.
It seems to be the ...
or (TESTSYSTEMID<>previous(TESTSYSTEMID))
statement that was screwing things around. I've removed this and things are looking pretty rosy.
I've just got to find the max of unique SEQENTALs and things are good.
I think you should replace it with
or (TESTSYSTEMID<>previous(TESTSYSTEMID) and DOSINGDAY=1)
or it will skip all first values.
This is looking good. But I have an issue when I tried it with a bigger data set.
If one of the patients has 3 days of sequential treatments then a break then 5 days of sequential treatments the function we have gives a value of 8. Whereas the value I am after is 5 in that case.
I will amend the data set in the original xls file so that patient1 has this issue.
Cheers
Mark
I've already post you the right expression:
if((DOSINGDAY=previous(DOSINGDAY)+1 and TESTSYSTEMID=previous(TESTSYSTEMID) and
peek(SEQUENTAL)<>0) or (TESTSYSTEMID<>previous(TESTSYSTEMID) and DOSINGDAY=1), 1, 0) as SEQUENTAL
Thanks Whiteline,
But when I run the expression as it stands I get in SEQCOUNT a count of DOSINGDAY (albeit as a count (SEQUENTAL)), not a count of the maximum sequential days that a patient recieved treatment.
Cheers!
What is wrong ?