Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I'm working on a datamodel for a Datawarehouse that should meet the following major requirements :
As generic as possible (various datasources, even ones we are currently not thinking of J)
Easy to extend
Easy to support
High Performance (GUI and aggregation on facts) especially when dealing with very high amount of data (this is one of the most crucial points, please keep this always in mind when giving feedback)
Just a few words on the concept …
Objects will be defined by linking "objectclasses" to the objects. This is a bit like the associative data model approach :
Identifier Name
77 Flight BA1234
08 Heathrow Airport
32 12-Aug-1998
48 10:25am
12 arrived at
67 on
09 at
Links
Identifier Source Verb Target
74 77 12 08
03 74 67 32
64 03 09 48
As you see in the diagram, I'm trying a compromise between complexity/flexibility/performance. Hence I have choosen a not completely normalized snowflake model with splitted fact tables.
If you're giving feedback, please consider as well the "traditional" QlikView approach (Dimensions/LinkTable/Facts/Starschema) and If you see advantage of one over the other please let me know.
Thanks a ton ...
Thomas
nobody any oppinion ? is it complete nonsense ?
How often the document will be reloaded?
If you join everything in a big table the application performance will be great, but the reload performance will be poor.
If you keep this schema the application, prior the charts, could take too long to calculate, but the reload process will be faster.
best regards,
Fernando D'Agosto
ThomasWRieck wrote:nobody any oppinion ? is it complete nonsense ?
I wrote up a long opinion days ago before deciding not to post it and deleting it.
Basically, if I understand what you're thinking, I suspect you're making a mistake. Two business systems that I wrote use generic data like this. It made sense for the business systems. As you said, easy to extend, easy to support, and high performance, at least for those cases (and some would debate me on the "high performance" part). I have so far only read data from one of these systems into QlikView, and when I did, I converted all of the "virtual fields" into real QlikView fields. I do not feel that a generic database is a good data model for QlikView, even if it was a good model for the source system. Our "data warehouse" of QVDs is made up of a whole lot of specific QVDs with specific fields, not generic data, or even abstracted data.
A substantial number of our source databases are at least partially-abstracted, with very general-sounding fields like "demand" "item" "consumer" "transfer method" and so on. In this case, I might store an order item in this abstract database, with demand = order item ID, item = product ID, consumer = customer ID, and transfer method = method of shipment to the customer. But when I load it into QlikView, I create a QVD specific to order items, and the fields get more concrete names like "order item", "product", "customer" and "shipment method". The abstraction goes away. Different types of objects stored in the same abstract database get split apart into multiple QVDs. I would not try to keep generic and abstract when moving the data into QlikView. I would make everything very specific, which is the opposite of what I think you are thinking of doing.
The main reason I didn't post it is that I honestly have no idea what your requirements are, or what you're thinking of doing. I don't feel confortable telling you what your data warehouse should look like based on a few small comments about how you want it to be, and my possible misinterpretation of those comments. Your shop may be completely different from my shop. If you take my advice, it might be a complete disaster, and your approach might have been much better. I simply don't have the information, or the time to learn the information, that would allow me to offer what I would consider useful advice.
Anyway, those are/were my thoughts. I probably should have just posted them the first time. At least it's better than no reply.