Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi guys ,
I need your little help.
I have start_time of different jobs in my model and I want to generate the average start time for these jobs.
Can anyone help?
Starttime of jobs is in format 7:23:05 AM
Thanks in advance.
With only this information, a simple avg(starttime) with the job dimension should do the job.
What are you trying to get on the graph your using.
I just need the data in the pivot table. I need to calculate the avg start time. But I am not getting appropriate result using above expression. Some of the jobs have null start time values which I need to exclude. There are multiple jobs that are forming a source so I need to aggregate that and where ever there are null entries for the job start run I need to exclude that.
Suppose the job timings for a month are 12:40:00 AM , 12:20:00 , NULL, 11:00:00 AM then I need avg of These timings . But I am getting results like 9 am.
I am using the below expression
.
Aggr(time((avg(num(starttime))),jobid)
Can anyone help me with the correct expression.
You need to use another avg for your expressoin:
avg(Aggr(avg(num(starttime)),jobid))
Just the: Aggr(time((avg(num(starttime))),jobid)
will get you a temporary "table" with the results of the avg by jobid.
Actually i am looking for a way to exclude the null values using some set expression. If you could help
Example of what the aggr will do.
Lets say you have this dataset:
Data:
Load * inline
[
Date,JobId,starttime
01/01/2018,JobA,9:45
02/01/2018,JobA,7:50
01/01/2018,JobB,19:45
04/01/2018,JobB,7:50
05/01/2018,JobC,9:45
01/01/2018,JobC,7:50
]
And you use the expression:
avg
(
aggr
(
sum(starttime),
JobId
)
)
What this will do first is sum(starttime) grouped by JobId, giving the following:
JobId | sum(starttime) |
---|---|
62:45:00 | |
JobA | 17:35:00 |
JobB | 27:35:00 |
JobC | 17:35:00 |
And then apply the avg on this table, getting
JobId | avg ( aggr ( sum(starttime), JobId ) ) |
---|---|
20:55:00 | |
JobA | 17:35:00 |
JobB | 27:35:00 |
JobC | 17:35:00 |
Which is different from doing the avg(starttime), which gets the following:
JobId | avg(starttime) |
---|---|
10:27:30 | |
JobA | 08:47:30 |
JobB | 13:47:30 |
JobC | 08:47:30 |
This is because of the number of fields of the count(JobId) when doing the aggregate:
JobId | sum(starttime) | Aggr Avg | Count Aggr | Normal Avg | Normal Count |
---|---|---|---|---|---|
62:45:00 | 20:55:00 | 3 | 10:27:30 | 8 | |
JobA | 17:35:00 | 17:35:00 | 1 | 08:47:30 | 3 |
JobB | 27:35:00 | 27:35:00 | 1 | 13:47:30 | 2 |
JobC | 17:35:00 | 17:35:00 | 1 | 08:47:30 | 3 |
So the aggr will exclude the null or blank values, since it cannot do the sum on them.
Hope its a little clearer now.
Felipe.