I still haven't been able to work out the correct model to make this work, but I can see that if required, the storage date is definitely separate from the transaction dates and timesheet hours. However, the timesheet hours are what allows the transactions to happen and as such, the transaction hours and the timesheet hours need to be linked.
I just cannot see how to do this and I suspect that once one ofteh QV model wizards has a quick peek, they will be able to point me in the right direction . . . and I suspect I could move all of the dates, including storage dates, onto a master calendar.
Any stears people?
My suggestion is that you take a look at the model and see if you can reshape it to a star or snowflake schema. Generally you are very likely to end up with trouble when you have a chained structure like this. Consider if you have a performing calculations involving a fields in Site_Details and Worked_In_Dept, this means that QlikView needs to find alogical path between the fields in order to evaluate the relations. In this case the realtion goes over 7 tables, which is very much. If there is no logical linking between the values of your selection between the tables, then QlikView will combine all values leading to incprrect results in your application.
I also noticed that you have common values in tables without having them linked, for example PRIN_CODE and SITE_CODE. To me it seems like these should be gathered in one table and from there linked by a key field to table where required. The same goes with dates, try to use one master calendar as much as possible to have a common calendar for your data.
Bit short on time, but hope this short entry can be of some help.
I understand the benefit of a star configuration, however, with limited experience to building them . . . this is my first QV (or BI for that matter) application, I am pretty much at a loss for where to start?
What is the key fact table? My understanding is that in a star (or snowflake) configuration there is a key fact table or tables . . . but how to build this without loops or a gazillion synthetic keys is my challenge. I have been looking at this for 3 straight days with not an iota of progress :-(
I can see at least 4 key fact table: TransDet, Timesheet_Hours, Stock_Ledger_Details and Storage_Details. Pretty much all of the remaining tables are all dimension tables of some kind. I had initially tried a single calendar and wound myself into such loops I took days to unwind myself.
Another problem that I can forsee, is that in TransDet, there is not exactly a 100% unique key - the application services a warehouse environment and it is feasible that the transaction could share date, prin_code, job_no, prod_code, site_code and loc_code . . . and I have a feeling that if that is the case I could face issues?
Clearly help of all and any kind would be greatly appreciated. I am guessing that I need to change the data model completely - which I am quite prepared to do - but I do need a little guidance here if possible.
you are jusmping in at the deep end with the model you have been posting about so you'd better equip yourself. Below is a very useful ressource that would be worth getting
Ralph Kimball: The Data Warehouse Toolkit, editor Wiley.
He describes simply and with many examples how to build star schemas. His aim was to have thsi done in RDBMS' with sql but his modeling principles can be applied to anything. I only did OLAP before getting on to qv and for both the lessons are fine. read the book and you'll be dreaming about star schemas
the link is to a 2nd edition which I can't vouch for as I have the original and haven't seen the second one.
I can't say how readeable, understandeable it is for a non IT/BI person but he walks you through lots of different examples and shows how they should get modeled.
Your starting point is ask yourself what are you trying to measure?
Then start rebuilding the thing brick by brick. Instead of trying to unwind your data model., load a new qvw with just one central (ie fact) table which contains what you are trying to measure. see how the dimensions (the things you are measuring against, ie sites, dates, locations ???) tie into this.
then take your second fact table. how does it relate to the first? what dimensions do they share? do they have the same granularity (ie does one look at days, the other at months? does one look at sites the others at countries?
start small aim big. keep your original model as maybe those are the charts you ultimately want to build but you may learn and progress more trying to rebuild the thing rather than trying to fix the original.
just my 2cts.
I will definitely look it up and buy it - sounds very good. I do in fact come from an IT background - 23 years of experience - mostly in ITIL and service roles . . . so this is a big change. But loving it and as I have extensive Excel skills, this is not too deep endish for me . . . well okay a little.
I did really sort of do as you suggested . . . I started with one of the core fact tables Timesheet_Hours and build it from there. But as it was my first application, I used my Excel skills and understanding of 1-NF tables from Uni 25 years ago . . . and hence the links and not the star.
I don't believe the model is too far off - I will happily rebuild from scratch if required, but I only have 1 month left on the contract. I would like management to start using it to get their feedback and make the changes and tweaks as required.
The whole application is about warehouse (and staff) efficiency. The two main fact tables are thus - TransDet and Timesheet_Hours. The other two fact tables will have links but are logically separate - one will look at space utilisation efficiency and the second at changes in charging mechanisms - separate but using some of the same dimension tables.
I guess I was hoping that someone would have the time to look at the model and make some succint suggestions.
P.S. The contract was initially to build this in Excel, but there are millions of rows of data and it was three weeks in and I already knew that it was not feasible in Excel. Hence my search and identification of QV - sharp learning curve - and happy that I have done it. It is a great product and a powerful tool and hopefully something I can take forward.
In your current data model, the transactions and time sheets are linked ONLY by principal code, not by anything else, including by date. If you want to avoid model changes to link dates together, you can create a date island, and use that as a dimension. Then it would look something like this:
=sum([TransDet TXN_DATE] =IslandDate,[TransDet TOTAL_VOL])
/sum([Timesheet_Hours Date]=IslandDate,[Timesheet_Hours Hours])
Performance will be poor, but I'd expect it to work. If it performs well enough for you, you're probably good to go.
As far as changing the model to link everything together, I don't want to attempt it without a better understanding of your data and your requirements, and I don't want to take the time to really understand your data and requirements. But I'd expect any new data model to bring the timesheet and transaction tables closer together, possibly combined, or just keys to the tables combined in a fact table where the fact table had your integrated date dimension. I don't know that this is correct, but it appears that the records on your stock ledger detail, storage details, trans det, and timesheet hours could all be considered "activities". You might, then have an activity table to combine them all, with an activity type field describing what that particular row is all about, and possibly keys to tables that expand on the data so that you don't have too many fields that are typically null in the central table. This is probably a very similar suggestion to Toni's, and would probably give you much more of a snowflake schema. No idea if it's really the right approach, though.