I have been tasked with determining the interval between dates of a PM schedule. Here is a sample of my data:
EventScheduleID | EventStartDateTime | Frequency |
100 | 7/6/2015 | Weekly |
100 | 7/13/2015 | Weekly |
100 | 7/20/2015 | Weekly |
103 | 9/1/2015 | Monthly |
103 | 3/1/2016 | Monthly |
103 | 9/1/2016 | Monthly |
103 | 3/1/2017 | Monthly |
My client's database stores the frequency; you can see Weekly and Monthly in the example. There might be some schedules that are Daily and Yearly as well. There is no Quarterly, Bi-Monthly, or Semi-Annually. As you might predict, the way they have to set up Quarterly is to choose Monthly and then select every 3 months. Is there a formula that creates the words Quarterly, Semi-Annually, etc.? I have all the schedules and their start dates going out 2 years. So in the example above, for Schedule 103 I want to change Monthly to Semi-Annually on the front end. I hope this makes sense. Any suggestions are welcome.
David
Is the fact that there are 6 months between the rows for that EventScheduleID what makes the frequency Semi-Annually? You need this on the front end? In what kind of object?
To answer your first question: yes. So if my client runs a report on the schedules, what they are going to get in return is:
EventScheduleID Frequency
100 Weekly
103 Monthly
100 is fine because it is Weekly, but 103 shows Monthly on the front end when it is actually Semi-Annually. I hope this makes sense. I would like an Excel output showing Semi-Annually instead of Monthly. Currently they open each schedule, go to the frequency section, and see "every 6 months". That is a manual process, which is why I was looking for a formula.
David
You might want to correct your source data instead of trying to repair it in the front end.
regards
Marco
If I could, I would. It is the product data and that is not going to be changed. If there is no formula, I am OK with it.
Hi David
There is a way you can achieve this by calculating the average period between each event for each EventScheduleID and converting that to an interval which can be decoded into a frequency.
I have only had a quick look at this to prove the method and offer a possible principle of operation, so you may find you need to do some sensitivity analysis on it, e.g. when events fall out of sync with their frequency or change frequency. If your intervals are fairly stable it should offer something of a solution. Hopefully it will serve as a starting point towards a better one.
In essence the steps are: load the raw data, sort it by EventScheduleID and date, calculate the number of days since the previous event within each schedule, average those gaps and convert the average to events per year, then decode that figure into a frequency label.
Here is the code to try out my proposed solution. I've written notes to help explain what happens at each stage.
//Step 1: Load the raw data
tblData_temp:
Load * inline [
LineID,EventScheduleID, EventStartDateTime, Frequency
1,100,07/06/2015,Weekly
2,100,07/13/2015,Weekly
3,100,07/20/2015,Weekly
4,103,09/01/2015,Monthly
5,103,03/01/2016,Monthly
6,103,09/01/2016,Monthly
7,103,03/01/2017,Monthly
8,104,02/05/2016,Daily
9,104,02/06/2016,Daily
10,104,02/09/2016,Daily
11,104,02/10/2016,Daily
12,104,02/11/2016,Daily
13,105,02/15/2014,Annually
14,105,02/07/2015,Annually
15,105,02/11/2016,Annually
16,106,02/11/2016,Bi-Monthly
17,106,04/15/2016,Bi-Monthly
18,106,06/05/2016,Bi-Monthly
19,106,08/11/2016,Bi-Monthly
20,107,02/11/2015,Quarterly
21,107,05/05/2015,Quarterly
22,107,08/16/2015,Quarterly
23,107,11/11/2015,Quarterly
24,107,02/11/2016,Quarterly
];
//Step 2: Format the dates to ensure they are proper dates
//(You may not need this step if your dates are already in a proper date format)
left join (tblData_temp)
Load
LineID
,Date(MakeDate(Right(EventStartDateTime,4),left(EventStartDateTime,2),mid(EventStartDateTime,4,2)),'DD/MM/YYYY') as EventDate
Resident tblData_temp;
//Step 3: Sort the raw data to get it into the correct sort order for processing
tblData:
noconcatenate
Load
LineID,
EventScheduleID,
EventStartDateTime,
EventDate,
Frequency
Resident tblData_temp
order by EventScheduleID,EventDate asc
;
//Step 4: Drop the old raw data
Drop table tblData_temp;
//Step 5: For each EventScheduleID calculate the number of days since the last event.
//This sets to 0 for the first instance of each new EventScheduleID
left join (tblData)
Load
LineID,
If(EventScheduleID=Previous(EventScheduleID),EventDate-Previous(EventDate),0) as DaysSinceLast
Resident tblData;
//Step 6: Calculate the intervals of the year based on a 365 day year
left join (tblData)
Load
EventScheduleID,
Sum(DaysSinceLast)/(Count(DaysSinceLast)-1) as averageDays,
365/Floor(Sum(DaysSinceLast)/(Count(DaysSinceLast)-1)) as averageInterval,
Floor(365/Floor(Sum(DaysSinceLast)/(Count(DaysSinceLast)-1))) as averageEventsPerYear
Resident tblData
Group By EventScheduleID;
//Step 7: Decode the averageEventsPerYear values into text versions of the frequency.
//You may need to change this into ranges rather than absolute values.
left join (tblData)
Load
LineID
,If(averageEventsPerYear=365,'Daily'
,If(averageEventsPerYear=52,'Weekly'
,If(averageEventsPerYear=6,'Bi-Monthly'
,If(averageEventsPerYear=4,'Quarterly'
,If(averageEventsPerYear=2,'Semi-Annually'
,If(averageEventsPerYear=1,'Annually','Check')))))) as calculatedFrequency
Resident tblData;
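If it helps to sanity-check the arithmetic outside the load script, here is the same averaging-and-decoding logic sketched in Python. The function names are my own and purely illustrative; they mirror Steps 5 to 7 above.

```python
import math
from datetime import date

def average_events_per_year(dates):
    """Mirror Steps 5-6: average the day gaps between consecutive
    events, then convert that average into whole events per year."""
    dates = sorted(dates)
    gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
    average_days = sum(gaps) / len(gaps)   # e.g. ~182.33 for ID 103
    return math.floor(365 / math.floor(average_days))

def decode(events_per_year):
    """Mirror Step 7: map the whole-number figure to a frequency label."""
    labels = {365: 'Daily', 52: 'Weekly', 6: 'Bi-Monthly',
              4: 'Quarterly', 2: 'Semi-Annually', 1: 'Annually'}
    return labels.get(events_per_year, 'Check')

# Schedule 103 from the sample data: gaps of 182, 184 and 181 days
# average to ~182.3 days, i.e. 2 events per year.
print(decode(average_events_per_year(
    [date(2015, 9, 1), date(2016, 3, 1),
     date(2016, 9, 1), date(2017, 3, 1)])))  # prints "Semi-Annually"
```

Note the average divides by the number of gaps (the script's Count(DaysSinceLast)-1), not the number of rows, since the first row of each schedule contributes a 0.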
The results will look something like this:
EventScheduleID | EventStartDateTime | EventDate | Frequency | DaysSinceLast | averageDays | averageInterval | averageEventsPerYear | calculatedFrequency |
100 | 07/06/2015 | 06/07/2015 | Weekly | 0 | 7 | 52.14285714 | 52 | Weekly |
100 | 07/13/2015 | 13/07/2015 | Weekly | 7 | 7 | 52.14285714 | 52 | Weekly |
100 | 07/20/2015 | 20/07/2015 | Weekly | 7 | 7 | 52.14285714 | 52 | Weekly |
103 | 09/01/2015 | 01/09/2015 | Monthly | 0 | 182.3333333 | 2.005494505 | 2 | Semi-Annually |
103 | 03/01/2016 | 01/03/2016 | Monthly | 182 | 182.3333333 | 2.005494505 | 2 | Semi-Annually |
103 | 09/01/2016 | 01/09/2016 | Monthly | 184 | 182.3333333 | 2.005494505 | 2 | Semi-Annually |
103 | 03/01/2017 | 01/03/2017 | Monthly | 181 | 182.3333333 | 2.005494505 | 2 | Semi-Annually |
104 | 02/05/2016 | 05/02/2016 | Daily | 0 | 1.5 | 365 | 365 | Daily |
104 | 02/06/2016 | 06/02/2016 | Daily | 1 | 1.5 | 365 | 365 | Daily |
104 | 02/09/2016 | 09/02/2016 | Daily | 3 | 1.5 | 365 | 365 | Daily |
104 | 02/10/2016 | 10/02/2016 | Daily | 1 | 1.5 | 365 | 365 | Daily |
104 | 02/11/2016 | 11/02/2016 | Daily | 1 | 1.5 | 365 | 365 | Daily |
105 | 02/15/2014 | 15/02/2014 | Annually | 0 | 363 | 1.005509642 | 1 | Annually |
105 | 02/07/2015 | 07/02/2015 | Annually | 357 | 363 | 1.005509642 | 1 | Annually |
105 | 02/11/2016 | 11/02/2016 | Annually | 369 | 363 | 1.005509642 | 1 | Annually |
106 | 02/11/2016 | 11/02/2016 | Bi-Monthly | 0 | 60.66666667 | 6.083333333 | 6 | Bi-Monthly |
106 | 04/15/2016 | 15/04/2016 | Bi-Monthly | 64 | 60.66666667 | 6.083333333 | 6 | Bi-Monthly |
106 | 06/05/2016 | 05/06/2016 | Bi-Monthly | 51 | 60.66666667 | 6.083333333 | 6 | Bi-Monthly |
106 | 08/11/2016 | 11/08/2016 | Bi-Monthly | 67 | 60.66666667 | 6.083333333 | 6 | Bi-Monthly |
107 | 02/11/2015 | 11/02/2015 | Quarterly | 0 | 91.25 | 4.010989011 | 4 | Quarterly |
107 | 05/05/2015 | 05/05/2015 | Quarterly | 83 | 91.25 | 4.010989011 | 4 | Quarterly |
107 | 08/16/2015 | 16/08/2015 | Quarterly | 103 | 91.25 | 4.010989011 | 4 | Quarterly |
107 | 11/11/2015 | 11/11/2015 | Quarterly | 87 | 91.25 | 4.010989011 | 4 | Quarterly |
107 | 02/11/2016 | 11/02/2016 | Quarterly | 92 | 91.25 | 4.010989011 | 4 | Quarterly |
As you can see, it has corrected 103 from Monthly to Semi-Annually. I hope that offers some food for thought on how you could approach the problem.
Kind regards
Steve
Thanks Steve. It worked perfectly. I just had to make some adjustments to the calculatedFrequency options, but other than that it got me close. Awesome.
Hi Steven,
I finally put your solution into place. The problem I ran into was that it returned a bunch of 'Check' values, with numbers like:
averageEventsPerYear |
8 |
3 |
1 |
5 |
26 |
13 |
11 |
0 |
My question is: if I check them and assign a calculated frequency, will those numbers always map to the same answer? Your solution has been awesome, but I would like some additional advice.
David
Hi David
I'm pleased my idea has been of some help. As mentioned in my original post this solution will work most effectively if your intervals are fairly stable, i.e. there are approximately the same number of days in between each event for each event schedule id.
To answer your question: as this model is based on regular day intervals, it will always provide consistent answers; however, if those answers do not fit your interval model then the IDs will emerge as 'Check' exceptions. Consequently, if you hard-code the figures shown in your table, the larger values are likely to be relatively accurate but the lower values may be more at risk of error.
If you have widely fluctuating event intervals in your data then it will be difficult to categorise them. From the values shown in your chart it does seem to suggest that you have some EventScheduleID's where the frequencies are quite unstable and/or the frequencies have changed periodically which would skew the results.
For example, you would expect that a weekly event took place every 7 days, providing approximately 52 events per year, while a Semi-annual event took place every 182.5 days providing 2 events per year etc. So if your data has intervals which are not stable you will find that your average days figure will drift from these expected values and the larger the variation then the wider the drift.
There are a few ways to deal with this. For example:
Sensitivity analysis on your data would help you to identify, firstly, whether or not this solution will actually work for you and, secondly, whether you can define a set of ranges that will give you reasonable answers. The 'Check' indicator will allow you to identify the outliers in your result set and analyse the degree of drift from the expected values. There is always a risk that a widely varying data set will produce a valid result suggesting a frequency which is actually not correct.
Once you know the drift, you can assign a set of ranges instead of absolute values in the nested If statement, allowing a wider range of values to be categorised into a group. This will be more forgiving for the longer intervals than the shorter ones, as e.g. the averages would have to drift a lot to move from 182.5 to 365 days but could more easily drift from 1 to 7 days.
So for example you could replace the nested if code block with something like:
//Step 7: Decode the averageEventsPerYear values into text versions of the frequency using ranges.
left join (tblData)
Load
LineID
,If(averageEventsPerYear>=335 and averageEventsPerYear <=395,'Daily'
,If(averageEventsPerYear>=49 and averageEventsPerYear<=60,'Weekly'
,If(averageEventsPerYear>=5 and averageEventsPerYear<=7,'Bi-Monthly'
,If(averageEventsPerYear>=3 and averageEventsPerYear<5,'Quarterly'
,If(averageEventsPerYear>=1.5 and averageEventsPerYear<3,'Semi-Annually'
,If(averageEventsPerYear>=0.5 and averageEventsPerYear<1.5,'Annually','Check')))))) as calculatedFrequencyRangeMethod
Resident tblData;
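For reference, the same banding can be expressed outside the script as an ordered list of ranges checked first-match-wins. This Python sketch (my own naming, same arbitrary thresholds) is equivalent to the nested If above:

```python
def decode_by_range(events_per_year):
    """Map an events-per-year figure to a frequency label using bands.
    Bands are checked in order and the first match wins, mirroring
    the evaluation order of the nested If in the load script."""
    bands = [
        (335, 395, 'Daily'),
        (49, 60, 'Weekly'),
        (5, 7, 'Bi-Monthly'),
        (3, 5, 'Quarterly'),
        (1.5, 3, 'Semi-Annually'),
        (0.5, 1.5, 'Annually'),
    ]
    for low, high, label in bands:
        if low <= events_per_year <= high:
            return label
    return 'Check'

print(decode_by_range(5))    # prints "Bi-Monthly" (ID 109's figure)
print(decode_by_range(121))  # prints "Check" (ID 108 fits no band)
```

Because the bands are checked in order, a boundary value such as 5 resolves to Bi-Monthly, just as it does in the nested If.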
I have just used arbitrary values in the above example code to demonstrate the principle; you may need to change them to suit your model. It effectively creates a series of frequency bands into which the EventScheduleIDs can be categorised.
With the data set used in the previous example, plus two new, less stable IDs, this would give the following results:
EventScheduleID | EventDate | DaysSinceLast | averageDays | averageInterval | averageEventsPerYear | calculatedFrequency | calculatedFrequencyRangeMethod |
100 | 06/07/2015 | 0 | 7 | 52.14285714 | 52 | Weekly | Weekly |
100 | 13/07/2015 | 7 | 7 | 52.14285714 | 52 | Weekly | Weekly |
100 | 20/07/2015 | 7 | 7 | 52.14285714 | 52 | Weekly | Weekly |
103 | 01/09/2015 | 0 | 182.3333333 | 2.005494505 | 2 | Semi-Annually | Semi-Annually |
103 | 01/03/2016 | 182 | 182.3333333 | 2.005494505 | 2 | Semi-Annually | Semi-Annually |
103 | 01/09/2016 | 184 | 182.3333333 | 2.005494505 | 2 | Semi-Annually | Semi-Annually |
103 | 01/03/2017 | 181 | 182.3333333 | 2.005494505 | 2 | Semi-Annually | Semi-Annually |
104 | 05/02/2016 | 0 | 1.5 | 365 | 365 | Daily | Daily |
104 | 06/02/2016 | 1 | 1.5 | 365 | 365 | Daily | Daily |
104 | 09/02/2016 | 3 | 1.5 | 365 | 365 | Daily | Daily |
104 | 10/02/2016 | 1 | 1.5 | 365 | 365 | Daily | Daily |
104 | 11/02/2016 | 1 | 1.5 | 365 | 365 | Daily | Daily |
105 | 15/02/2014 | 0 | 363 | 1.005509642 | 1 | Annually | Annually |
105 | 07/02/2015 | 357 | 363 | 1.005509642 | 1 | Annually | Annually |
105 | 11/02/2016 | 369 | 363 | 1.005509642 | 1 | Annually | Annually |
106 | 11/02/2016 | 0 | 60.66666667 | 6.083333333 | 6 | Bi-Monthly | Bi-Monthly |
106 | 15/04/2016 | 64 | 60.66666667 | 6.083333333 | 6 | Bi-Monthly | Bi-Monthly |
106 | 05/06/2016 | 51 | 60.66666667 | 6.083333333 | 6 | Bi-Monthly | Bi-Monthly |
106 | 11/08/2016 | 67 | 60.66666667 | 6.083333333 | 6 | Bi-Monthly | Bi-Monthly |
107 | 11/02/2015 | 0 | 91.25 | 4.010989011 | 4 | Quarterly | Quarterly |
107 | 05/05/2015 | 83 | 91.25 | 4.010989011 | 4 | Quarterly | Quarterly |
107 | 16/08/2015 | 103 | 91.25 | 4.010989011 | 4 | Quarterly | Quarterly |
107 | 11/11/2015 | 87 | 91.25 | 4.010989011 | 4 | Quarterly | Quarterly |
107 | 11/02/2016 | 92 | 91.25 | 4.010989011 | 4 | Quarterly | Quarterly |
108 | 02/12/2016 | 0 | 3.333333333 | 121.6666667 | 121 | Check | Check |
108 | 05/12/2016 | 3 | 3.333333333 | 121.6666667 | 121 | Check | Check |
108 | 08/12/2016 | 3 | 3.333333333 | 121.6666667 | 121 | Check | Check |
108 | 12/12/2016 | 4 | 3.333333333 | 121.6666667 | 121 | Check | Check |
109 | 01/02/2017 | 0 | 67.66666667 | 5.447761194 | 5 | Check | Bi-Monthly |
109 | 01/04/2017 | 59 | 67.66666667 | 5.447761194 | 5 | Check | Bi-Monthly |
109 | 12/06/2017 | 72 | 67.66666667 | 5.447761194 | 5 | Check | Bi-Monthly |
109 | 23/08/2017 | 72 | 67.66666667 | 5.447761194 | 5 | Check | Bi-Monthly |
In the above example the stable events have again been categorised correctly using the range method. The slightly less stable ID 109 has also now been categorised correctly; however, ID 108 remains at 'Check'. This is correct, as ID 108's events take place roughly every 3 days (about 121 per year), which falls in no band in your model (between Weekly and Daily), so the report has correctly highlighted ID 108 as an exception.
I hope that helps to get you a step further with your analysis.
Kind regards
Steve
Hi Steve,
Again, thank you so much. I was hoping I could adjust the formula like you did. I now have it down to just a few outliers that I need to work on.
David