Questioning Approach on Date Dimension Table with Each Date Having a Row for All the Dates
I'm new in my role and have been given the task to update HR's star schema data model.
Since it is HR, their reporting is centered around people counts, hires, fires and turnover. This includes comparisons to previous months and years.
They do have a Date Dimension but there is also another date dimension that they use for comparing previous months/years. That DateCompare dimension is set up where each date has a comparison date, based on how far you want to go back/forward. There are also date rankings to help filter on the data as well as do reporting more easily.
So if you go back for three years, for each month end date in those three years, there will be a row with all of those dates. July 31 2021 will have 36 Rows, August 31 2021 will have 36 rows, etc. I think the attached file (a sample CSV in the ZIP) makes it easier to visualize. It is 36 months multiplied by 36 months = 1,296 rows. For a dimension that is not so bad, I guess.
The reason this table exists is to be able to do the calculations comparing points in time more easily and I understand that need when doing dynamic charts. It is working nicely for the department. However, I thought all of it can be done with calculations instead of this DateCompare dimension table. Also, I'm concerned that when this DateCompare dimension is connected to the fact table, you get duplicates because of the multiple rows. While these duplicate rows are automatically filtered out because the Date row has to be selected, it still seems like it is not the best approach.
Have you seen an approach like this before? If so, what are the pros/cons of it? Is there a different approach or is their way the better solution?