Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I'm trying to create a table that shows the average number of distinct employees that visit the gym per hour each day. A day only consists of 3 hours. My data consists of records that are created each time an employee swipes his badge, which looks like...
Sales_Person | Day | Hour |
---|---|---|
Frank | 1 | 2 |
Dana | 1 | 2 |
Dana | 1 | 2 |
Dana | 1 | 3 |
Dave | 1 | 3 |
Angela | 2 | 3 |
A straight table of distinct counts by hour and day would look like...
Hour | Day | Distinct Count |
---|---|---|
1 | 1 | 0 |
1 | 2 | 0 |
2 | 1 | 2 |
2 | 2 | 0 |
3 | 1 | 2 |
3 | 2 | 1 |
Note that the above table includes hour 1, even though nobody visited the gym during hour 1 on either day 1 or day 2. Then taking the average of the distinct counts over the Day dimension...
Hour | Avg Daily Distinct Count |
---|---|
1 | 0 |
2 | 1 |
3 | 1.5 |
This is the table I'm looking for. What dimensions, expressions, and other properties do I need to make this table?
Good exercise!
Try this one:
sum(
Aggr(
Count(DISTINCT Day),Sales_Person,Hour))
/
Count(DISTINCT TOTAL <Sales_Person> Day)
Hi,
Dimemsion will be Hour, Expression:
avg(
Aggr(DISTINCT
Count( Day),Sales_Person,Hour))
Please review attached.
Good luck!
Hi Mark,
I'm wondering whether my suggestion was helpful. If yes, please don't hesitate to mark it accordingly
Thanks!
Hi Boris,
Thank you for your response, but your suggestion and your example do not achieve the desired result. In the example you posted (with different data), the averages are not correct. Given the data you loaded, the final table should read:
Hour | avg( |
---|---|
1 | 0.3333 |
2 | 0.6667 |
3 | 1.6667 |
4 | 0 |
Take Hour 2, for instance. Your expression returns 1 as the average number of visits for Hour 2 when the actual average is 2/3. Think of it this way: On Day 1, there were 2 distinct visitors during Hour 2. On Day 2, there were 0 distinct visitors during Hour 2. On Day 3, there were 0 distinct visitors during Hour 2. The average number of visitors during Hour 2 over Days 1-3 is (2+0+0)/3 = 2/3.
I see... Try this one:
sum(
Aggr(DISTINCT
Count(Day),Sales_Person,Hour))
/
Count(DISTINCT TOTAL <Sales_Person> Day)
this gives you desired result
Thanks!
That's very close. There's just one more detail to work out. I tried adding a duplicate line to your inline table and the table calculations were off. Here's what I mean...
The line I added was
Dana, 1, 1
which is exactly like another record in the inline table. Since I only want to count visits by distinct employees in a given hour on a given day, this should not have changed the final average calculation. But if you try it, you can see that the calculation for Hour 1 changes from 1/3 to 2/3 because it's counting both of Dana's visits in the same hour and day. Do you know how we can fix this?
Good exercise!
Try this one:
sum(
Aggr(
Count(DISTINCT Day),Sales_Person,Hour))
/
Count(DISTINCT TOTAL <Sales_Person> Day)
Awesome job. Thanks Boris!