Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello data experts,
I have the following example of data where I am unable find out exactly how to model it properly.
The goal
The goal is to analyse how long patient's have walked per day as well as how long they have slept.
The data (sample):
Activity:
Date | Calories Burned | Distance |
01-04-2018 | 2,976 | 6.82 |
02-04-2018 | 3,060 | 7.89 |
03-04-2018 | 2,570 | 7.65 |
04-04-2018 | 2,591 | 7.8 |
05-04-2018 | 2,615 | 8.03 |
06-04-2018 | 2,700 | 32.41 |
07-04-2018 | 4,385 | 27.64 |
08-04-2018 | 3,005 | 10.73 |
09-04-2018 | 2,538 | 23.55 |
10-04-2018 | 2,682 | 34.78 |
11-04-2018 | 2,693 | 54.89 |
12-04-2018 | 2,636 | 24.04 |
13-04-2018 | 3,327 | 11 |
14-04-2018 | 3,402 | 11.06 |
15-04-2018 | 3,304 | 10.36 |
16-04-2018 | 2,552 | 3.66 |
17-04-2018 | 2,963 | 5.89 |
18-04-2018 | 2,517 | 13.61 |
19-04-2018 | 2,566 | 33.9 |
20-04-2018 | 2,846 | 17.23 |
21-04-2018 | 3,196 | 19.07 |
22-04-2018 | 3,105 | 18.81 |
23-04-2018 | 2,690 | 14.78 |
Sleep:
Start Time | End Time | Minutes Asleep |
29-04-2018 10:11 pm | 30-04-2018 6:22 am | 415 |
28-04-2018 11:19 pm | 29-04-2018 7:28 am | 429 |
28-04-2018 12:38 am | 28-04-2018 4:51 am | 237 |
25-04-2018 11:09 pm | 26-04-2018 8:13 am | 515 |
24-04-2018 11:55 pm | 25-04-2018 7:21 am | 382 |
23-04-2018 11:33 pm | 24-04-2018 6:40 am | 440 |
22-04-2018 11:48 pm | 23-04-2018 6:21 am | 389 |
21-04-2018 12:50 pm | 22-04-2018 8:57 am | 474 |
19-04-2018 11:09 pm | 20-04-2018 5:57 am | 379 |
18-04-2018 11:25 pm | 19-04-2018 6:41 am | 383 |
17-04-2018 11:40 pm | 18-04-2018 6:47 am | 414 |
16-04-2018 10:53 pm | 17-04-2018 5:07 am | 333 |
16-04-2018 12:46 am | 16-04-2018 6:27 am | 306 |
14-04-2018 11:45 pm | 15-04-2018 7:29 am | 404 |
14-04-2018 12:02 am | 14-04-2018 7:46 am | 422 |
12-04-2018 11:35 pm | 13-04-2018 7:24 am | 425 |
11-04-2018 11:20 pm | 12-04-2018 6:39 am | 370 |
10-04-2018 11:01 pm | 11-04-2018 7:11 am | 425 |
09-04-2018 10:44 pm | 10-04-2018 6:51 am | 431 |
08-04-2018 11:14 pm | 09-04-2018 6:23 am | 387 |
08-04-2018 12:42 am | 08-04-2018 8:31 am | 423 |
06-04-2018 11:56 pm | 07-04-2018 7:29 am | 425 |
05-04-2018 11:30 pm | 06-04-2018 7:30 am | 443 |
05-04-2018 12:04 am | 05-04-2018 7:00 am | 356 |
03-04-2018 11:06 pm | 04-04-2018 6:22 am | 392 |
03-04-2018 12:43 am | 03-04-2018 6:36 am | 300 |
01-04-2018 09:37 pm | 02-04-2018 8:16 am | 470 |
31-03-2018 10:11 pm | 01-04-2018 10:15 am | 646 |
Problem:
The problem I have is the only thing that links the two data sets above is dates.
It is easy to identify steps taken per day in the Activity table. And you can get the date from the End Date in the Sleep table.
But the sleep data is problematic because I am not sure how to figure out the following logic:
Anyone know how to transform the Sleep data that works for the above logic?
I think the DayStart() function can be a solution for you. It can take a dayshift parameter into account and if you specify 0.5 which is half a day or 12 hours as a dayshift it might be exactly what you need. From your example data I see that you never sleep past noon and you never start sleeping for the new day until several hours after noon...
[ACTIVITY]:
LOAD
Date(Date#([Date], 'DD-MM-YYYY') ) AS [Date],
[Calories Burned],
[Distance]
FROM [lib://ActivityAndSleep]
(html, utf8 , embedded labels, table is @1);
[SLEEP]:
LOAD
*,
Floor(DayStart([Start Time],0,0.5)) AS Date;
LOAD
Timestamp(Timestamp#([Start Time], 'DD-MM-YYYY hh:mm TT') ) AS [Start Time],
Timestamp(Timestamp#([End Time], 'DD-MM-YYYY h:mm TT') ) AS [End Time],
[Minutes Asleep]
FROM [lib://ActivityAndSleep]
(html, utf8 , embedded labels, table is @2);
Note line number 12 where the DayStart() has been used.