let's say i have
taskdim - taskid, taskName
opportunitydim - oppid, oppName
resourcedim - rsrcid, rsrcNm
taskfct - date, taskid, taskhours
oppfct - date, oppid, opphours
let's say tasks are in the past and a little in the future, and all opps are in the future. this data cannot be shoved into the same table; tasks and opps are on different levels of granularity. now let's say i want to graph hours over time. i'm using a stacked column chart that graphs hours by month. there is a task stack and on top of it there is an opp stack. no problems with that so far.
*then* i add an office field to each fact table, as well as a manager field. i want the user to be able to filter by office. if i name one officetask and the other officeopp, then i would need 2 selectors in the front end to pick office - no go. however, if i name them both office, they create a loop i don't like and the data seems to come out incorrect. please advise.
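If it helps to make the setup concrete, the model being described could be sketched with inline loads like this (table and field names are from the posts above; the rows are invented sample data):

```qvs
taskdim:
LOAD * INLINE [
taskid, taskName
1, Build server
2, Migrate data
];

opportunitydim:
LOAD * INLINE [
oppid, oppName
10, Acme renewal
];

resourcedim:
LOAD * INLINE [
rsrcid, rsrcNm
100, Alice
];

taskfct:
LOAD * INLINE [
date, taskid, taskhours, office, manager
2010-01-05, 1, 8, Chicago, Smith
2010-01-06, 2, 4, Boston, Jones
];

oppfct:
LOAD * INLINE [
date, oppid, opphours, office, manager
2010-03-01, 10, 6, Chicago, Smith
];

// Because taskfct and oppfct now share date, office and manager,
// QlikView associates them on that whole field combination,
// creating a synthetic key.
```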
sickmint79 wrote:this data cannot be shoved into the same table; tasks and opps are on different levels of granularity.
You could use a link table with a compound key, but why not just concatenate taskfct and oppfct into a single table? It may seem counterintuitive if you are a DB programmer, but it's a perfectly valid technique in QV.
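A sketch of the concatenation approach, using the table and field names from the original post; the common `hours` field and the `source` flag are my additions so the chart can still stack tasks vs. opps:

```qvs
// One combined fact table instead of two.
facts:
LOAD date,
     taskid,
     office,
     manager,
     taskhours AS hours,
     'Task'    AS source    // flag to tell the stacks apart
RESIDENT taskfct;

CONCATENATE (facts)
LOAD date,
     oppid,
     office,
     manager,
     opphours AS hours,
     'Opp'    AS source
RESIDENT oppfct;

DROP TABLES taskfct, oppfct;

// Rows keep their own granularity: task rows have a taskid and a
// null oppid, opp rows the reverse. A single office field on facts
// now filters both kinds of hours with one selector.
```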
-Rob
please elaborate on both solutions.
solution 1:
i read the first as: i build a new table, say an office dimension, put an id and the office name in it, and associate the id with the fact tables.
1. i kind of thought this was what qv was doing anyway
2. this still seems like i'd have the same problem?
solution 2:
i'm a very databasey guy, and still trying to get my head around how data is married in the qv world. please describe this solution further.
thanks!
This isn't answering your specific question, but I thought I'd expand a bit on how data modeling in QlikView can differ from data modeling for databases. Basically, the whole idea of data normalization falls apart in the QlikView world. QlikView works just fine with highly normalized data, and it was one of the things that quickly impressed me when I first started using it. I've never been a fan of reporting packages that require you to denormalize your data for reporting purposes. That said, QlikView will work equally well with highly denormalized data, and denormalization can sometimes be a better approach.
As a trivial example, let's say you have millions of individual sales each year, but of only five products, each of which has a rather long description. In the database world, you'd be best off giving each product a short ID, and having separate sales and product tables with the product ID connecting them.
You can do the exact same thing in the QlikView world, and it will connect everything together automatically and work great. However, you might be even better off just joining the long description into the main sales table when you load it in. Because QlikView's tables are read-only after the load, your descriptions can't get corrupted or out of sync. And because of QlikView's compression, it won't take any more space than if you'd had a separate table linked by ID. The end result is a simpler model - one table instead of two.
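The product example above might look like this in script (a sketch; the table names, field names, and QVD files are hypothetical):

```qvs
sales:
LOAD saleid, saledate, productid, amount
FROM sales.qvd (qvd);

// Denormalize: pull the long description straight into the fact table.
LEFT JOIN (sales)
LOAD productid, productdescription
FROM products.qvd (qvd);

// QlikView stores each distinct productdescription value only once
// internally, so the joined column costs little extra memory even
// across millions of sales rows.
```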
yes john, that sounds logical. still having trouble with my problem though. easier for me to figure out in sql for sure! 🙂
OK, let's start by seeing if I understand your problem. See the attached file. Is that what you're talking about? I'm not seeing a loop, but I am seeing what is probably an undesirable synthetic key.
If that isn't what you're talking about, can you modify the example to be what you're currently looking at? It's usually a lot easier to solve problems when we have an example to play with. 🙂
Well, if that is your situation, then here's an example implementation of Rob's "why not just concatenate taskfct and oppfct into a single table" suggestion. There is a single fact table, and the task and opportunity names have been joined into it. However, I'm not seeing any serious problem resulting from the original approach, or any difference in results.
i'll start off with a couple side questions -
1. are all synthetic keys undesirable? i actually thought qlik was doing something faster or saving memory when it created one - like it said, hey, this set of data is always the same on these tables, so rather than redundantly store the whole set, i'll store it once, give it an id, and have all the old locations just point to this set. should i be making the effort, and is it normally achievable, to remove all synthetic keys from a qv model? why is this synthetic key being made here, and how would you remove it?
2. regarding examples, i've got loads with production and sensitive data, is there any easy way to clean and cull it so as to provide an example? i made one in another thread, but it was all built from the ground up.
to your example -
my original one was similar to this, although rather than office being on the fact table, i had office as an attribute of both opportunity and of task. if you name them task_office and opp_office you will have no problem; however, if you name them both office you will see the loop i was talking about. i see how pushing it down to the fact layer actually closes that loop in your example, kind of like making an office dimension. is this the correct way to resolve this loop?
i've actually done part of what you've done already - i've already forced office down from the task dimension to the fact table. this was done because the task is unknown when the hours are not for a project. at the fact level i then end up going to the resource table and using the resource's office rather than the one i'd normally get from a project task, i.e. "unknown" for hours charged to vacation and whatnot. i had not yet done this for opportunities. tried that out and the dashboard seems to be working as expected!
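The "use the resource's office when the task office is unknown" step described above could be scripted with a mapping load, something like this (a sketch; the taskOffice and rsrcOffice fields and the QVD source are assumptions, not names from the posts):

```qvs
// Map each resource id to its office.
resourceOfficeMap:
MAPPING LOAD rsrcid, rsrcOffice
RESIDENT resourcedim;

taskfct:
LOAD date, taskid, taskhours, rsrcid,
     // fall back to the resource's office when the task has none,
     // e.g. hours charged to vacation
     If(Len(taskOffice) > 0, taskOffice,
        ApplyMap('resourceOfficeMap', rsrcid, 'unknown')) AS office
FROM taskfct.qvd (qvd);
```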
my model has, however, gone from 1 previously existing synthetic key to 3. perhaps that's related to other parts i am working on as well.
Let me get a partial answer to you.
I may be wrong, but I don't believe that all synthetic keys are undesirable. In some cases, QlikView is making synthetic keys exactly where I would make them myself. I'm guessing that in these cases, leaving the synthetic keys is the equivalent of me making the same tables. In the first example I posted, I don't think the synthetic keys are necessarily a problem.
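For reference, when you do want a synthetic key gone, one common approach is to build the composite key yourself and keep the shared fields on only one side (a sketch with the fields from this thread; the `|` separator and QVD sources are assumptions):

```qvs
taskfct:
LOAD date & '|' & office & '|' & manager AS linkkey,
     date, office, manager,    // shared fields kept on this side only
     taskid, taskhours
FROM taskfct.qvd (qvd);

oppfct:
LOAD date & '|' & office & '|' & manager AS linkkey,
     oppid, opphours           // shared fields dropped here...
FROM oppfct.qvd (qvd);

// ...so the two tables now associate on the single explicit linkkey
// instead of a synthetic key over date/office/manager. A fuller
// version would put date, office and manager in a separate link table
// keyed by linkkey.
```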
There might be a better way, but in document properties, there's a Scrambling tab where you can irreversibly scramble the contents of one, several, or all of your fields. The structure will remain intact, but the data won't. That might be a way to post otherwise sensitive data.
What I would do, though, is just make a set of inline loads that demonstrate the problem in as simple a way as possible. Having customers A, B and C should be easier than a randomly scrambled customer name and 500 customers, for instance.
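For example, a whole reproducible model can be a handful of inline loads (customer names and amounts invented):

```qvs
customers:
LOAD * INLINE [
custid, customer
1, A
2, B
3, C
];

sales:
LOAD * INLINE [
custid, amount
1, 100
2, 250
];
```

A stripped-down model like this is usually enough to reproduce an association or synthetic-key problem without exposing any production data.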
sickmint79 wrote:regarding examples, i've got loads with production and sensitive data, is there any easy way to clean and cull it so as to provide an example? i made one in another thread, but it was all ground up.
See this wiki page:
http://community.qlik.com/wikis/qlikview-wiki/preparing-examples-for-upload-reduction-and-data-scram...
-Rob