I've got a request for a report from a customer of ours and I'm struggling a bit with getting the statistics..
So, what the customer would like to get is to calculate the number of hours between different times of day. I have an events table and some dimensions, so the data should be enough. I'm just trying to find out what the best way to do this would be.
The end result should be something like
Time of day | Number of hours sold
00 | 400
01 | 300
02 | 243
03 | 507
I have today transactions that have a startdate and a stopdate, so that table sort of looks like:
ID; StartDateTime; StopDateTime; Duration (essentialy stopdate-startdate)
The StartDateTime and StopDateTime can run over days and obviously over hours and stop whenever so a normal transaction can look like:
1; 2012-09-01 14:45; 2012-09-01 17:10;165
I have a time dimensions table connected to the events table, but I'm not an expert (at all) on how the events table works.
My first thought was to sort of create a separate mapping table stating that this ID has this hour, and this hour like (for the example transaction above):
ID; hour; amount in hour
1; 14; 15
1; 15; 60
1; 16; 60
1; 17; 10
Another issue is that I have a data set of about 4,5 mil transactions which I have to do this on, and some of these would then recieve at max 24 rows and in average probably like 5 / transaction so that table would be 22,5 mil rows, which might be tough for the server to take care of :/ (have a 24 core 32 gb ram server)
Does anyone have an idea on how this issue can be solved in a better / any other way ?