Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Dear All,
I've been working with a large fact (22m) with a number (7) of dates.
The canonical date structure has been working brilliantly for me, however I now need to add a second fact.
This fact needs to be linked to the original fact, and also needs to be linked to the master calendar in its own right.
A picture speaks a thousand words so below is my current DM. Missing from the picture is the fact that %job_id = JobEntryID so the two FACTS can be linked.
This works OK for showing a summary level for a date i.e. I can pick a date (in this case the last month) and get my job info and my timesheet info.
However, what I can't do is select say a regional director and see the timesheet values.
Thinking about it a little more, I also need to make sure that I don't limit one way or the other, so for example a job might have its 7 dates within April but some work is done in May. If I was looking at my May report, I wouldn't expect the job to show up in the counters but I would expect my timsheet detail to
Is there any way to work this new fact into the existing canonical date structure?
I am guessing I can't join the tables as %job_id is the unique ID in the jobs_fact, but this is a one to many with the timesheet fact, but also sometimes a one to none.
I know I could scrap this method and build multiple calendars and link to a floating master but that means re-working all my expressions.
Any suggestions on the best way to approach this?
Hmm, in that case you could concatenate all the fact bits to the timesheet_fact table (and perhaps rename that table) and keep the other fields in a jobs dimension table. That way you won't get funny numbers when you sum a jobs fact which would happen if you joined the tables instead of concatenating them.
I am guessing I can't join the tables as %job_id is the unique ID in the jobs_fact, but this is a one to many with the timesheet fact, but also sometimes a one to none.
Not necessarily a problem. As long as it's not an n-to-m relationship you should be able to join the two tables. Just don't make it a left, right or inner join so as not to lose any data.
So a straighforward outer join to return both full datasets
Am I just thinking about it in too 'SQLy' terms, somehow I am imagining huge duplications of my job fact...
Weirdly in 6 years I've very rarely joined facts
What makes the jobs_fact table in fact table? I can't see all its fields, but what I do see doesn't look much like facts to me.
Sorry its a cut down one, there are a whole load of value fields
Obviously the dimensions in the table could be easily pulled and I could use a link table to join on the job_id and then individual calendars, which is where my mind was going
Hmm, in that case you could concatenate all the fact bits to the timesheet_fact table (and perhaps rename that table) and keep the other fields in a jobs dimension table. That way you won't get funny numbers when you sum a jobs fact which would happen if you joined the tables instead of concatenating them.
OK that is what I imagined would happen if I joined in that way so at least my logic was correct.
That makes sense actually and my brain just didn't quite make that leap as I was fixated on the dates... I like it, I will experiment thanks gwassenaar
As always now I am down the rabbithole it appears there is a n-m possibility so back to the drawing board on this one as I'll need to link at a higher level
However the concat would have been the correct example so I will mark it such