Good afternoon, I've been searching everywhere for the past two days without success and decided to post my question here. I'm building a data wharehouse for a course and I'm having some trouble. The facts are flights which have a time, airplane, origin_airport, dest_airport, origin_climate and dest_climate dimensions. |
I would approach the design of such a datamart a bit differently. All airports should be in one dimension, regardless of whether they're origin or destination. As you rightly said, you're struggling because an airport can be both at some point or for some flights. This dimension would then be role-playing on the Fact table. You'd have one field on the Fact that that connects to the AirportDim to get the Origin SK and another field on the Fact to get the Destination SK. In this way, you'd load all your airport dimension information in your single AirportDim, and reference it when you're loading the Fact table.
Of course you'd have an AirplaneDim and a TimeDim and DateDim (Dim - Dimension).
As for weather, depending on whether you have one or more observations (wind, temp, cloud cover etc...) you could choose to make this a degenerate dimension in the Fact table (meaning the weather metric is in the Fact table) or in a separate Dimension. It's best to seperate them in a dimension. Again, I'd create a WeatherDim that has for a business key the AirportName or ID, that way when I'm loading, I'm expiring the previous record for an airport and entering a new weather record for it. In effect, weather would be a Type 2 SCD and you can use the Talend component to do this. Similar to Origin and Dest Airport, I'd have 2 SK fields on the Fact table that point to the Origin weather and the DestinationWeather. If you're a fan of snowflaking, you could make the weather dim a snowflake of the AirportDim (I'm just pointing that out - snowflaking is typically frowned upon in the Kimball Methodology).