11 Replies Latest reply: Jun 13, 2011 3:48 AM by Jeroen Vuurens RSS

    Selecting fact table rows when a link table has been created

      Hi,

      in my QV 10 document, I have some dimensions and fact table and one link table. Each dimension key has a different name between dimension and fact table; the link table acts like a mapping table. I have created a list box to put the key values of a dimension, fe Year; Year in the fact table has another name and both names for Year are present in the link table. When I select an year of my list no rows are selected in the table box to show the measures. This issue doesn't occur when I use the year field of the fact table, but so in the year list box I haven't all possible values present in the year dimension table, that I'd like to have.

      Any idea to solve this issue? Thanks

        • Selecting fact table rows when a link table has been created

          It sounds like you link table is not connecting to the Fact table correctly, look into these things

          What field connects the Fact table to the link table, if this doen't involve the year (or a date) in some way you will not get a matching filter

          When you load the Year into the Link table is there something that is causing it to not load fully, e.g. are you joining it onto another incomplete set of data.

           

          Can you upload an example of what you are doing/.

           

          Jon

           

          • Selecting fact table rows when a link table has been created
            Oleg Troyansky

            It sounds like you over complicated your data structure. What's the reason to name join keys differently in the two tables? This way, you don't let QlikView use its associative logic and your solution will never be as simple and as fast as it could be...

            SInce you are asking questions in the "Getting started" forum, I assume you are relatively new to QlikView. In the case, you would be best off keeping things as simple as practical, without developing overly complex generic data models.

            just a thought...

              • Selecting fact table rows when a link table has been created

                Hi,

                I have added a link table to solve some circular references on my document. Without it I have one or more of these references. Searching on web, I have found that using link table could be a goog manner to solve this kind of issue, but in this way link table could affect the associative mechanism.

                Thanks

                  • Selecting fact table rows when a link table has been created
                    Oleg Troyansky

                    I think you misunderstood the meaning of the traditional LinkTable (or, perhaps you invented a new one?)

                    Check out this thread:

                    http://community.qlik.com/forums/t/15825.aspx

                    I'm sure there are many more posts where link tables are explained.

                    However, let me state that Link Tables typically solve problems with Synthetic keys. In your case, you mentioned Circular references (loops) - you might have to do some additional Data Modeling transformations to avoid those.

                    If you have multiple Facts - consider concatenating your facts into a single Fact table.

                    If you have repetitive attributes in the Dimensions, - try to eliminate duplication of the same data, or rename those particular fields that are causing loops.

                    Hard to diagnose without knowing the full picture...

                      • Selecting fact table rows when a link table has been created

                        Hi Oleg,

                        I'm working with QV from some weeks ago and I don't know QV very well. I have executed some proofs and I have read some articles and blogs.

                        A problem that I meet it is to solve possible circular references, because QV isn't a classic BI tool and it creates the associations by naming. This mechanism requires some warnings. After some proofs executed, I'm searching some best practises to prevent the circular reference and for me it is preferrable to prevent them that solve them when they occurs. Moreover, it can occur to use QV on a pre-built datawarehouse.

                        However, about circular reference, synthetic keys and link table I have found these links:

                        http://www.quickqlearqool.nl/?p=910

                        http://community.qlik.com/media/p/76744.aspx

                        A best practice could answer to this question: does QV work better on a star schema datawarehouse or snow-flake one? In my experience, I can have two dimensions linked between them and one of this linked to the fact table. Think to Product and ProductCategories dimensions or Companies and Departments dimensions.

                        You say that it is good to link more fact tables in one. So, is it a best practice for QV to manage only one fact table rather than two or more ones? Other BI tool can manage more fact tables without problem and why do I change in QV a pre-built dwh with two or more fact tables? It couldn't make sense to link Orders with Sales and with Purchases.

                        About synthetic keys, they are produced by QV. It's normal into a dwh to have groups of fields. Think to Products and ProductCategories: ProductID is linked to the Orders fact table, ProductCategoriesID is linked to Products dimension. This link is normal in a snow-flake schema and classic BI tools manage this case without problem, while QV creates a synthetic key for ProductID plus ProductCategoryID: ProductCategoryID isn't simply an attribute for Products but it's the key of ProductCategories. Think to Time or Calendar dimensions: Year and Month could be the composite key for these dimensions and also Year and Month could be a part of composite key of a fact table, fe Budget or Forecast (generally a budget is inserted or calculated monthly). Think to a geographic dimension with the composite key country and region and city: in a snow flake schema, Cities could be linked directly to a fact tables.

                        I have indicated common questions about dwh and BI/Olap tools and I think that QV has to be able to manage these common cases. So I think that to learn better this innovative tool it isn't the better way to execute more and more attempts and solve case after case. QV introduces a new BI approach and also new (but common) problems to solve. One of these common issues it is the circular reference. Ultimately, for QV there will be some best practises to prevent or avoid common QV problems, like circular reference and synthetic keys, but maintaning the benefits of the associative mechanism. So my case it is one of the issues above said and I think to post the data model figure it is less important to know the best practices for the common QV problems.

                        Many thanks

                          • Selecting fact table rows when a link table has been created
                            Oleg Troyansky

                            QlikView works in a different way compared to traditional DW/OLAP tools, and the comparison is very much favorable to QlikView - we can do things faster, better and more flexible than those traditional tools. The only thing - we need to do things differently with QlikView, to help it perform better in memory.

                            Regarding your questions - QlikView can work with snowflakes, however start schema performs better. Multiple fact tables are not a problem, unless they share multiple identical keys and therefore cause synthetic keys and circular references. In those cases, concatenating (not joining and not linking, but concatenating) multiple fact tables into a single fact table is one of best practices. It looks illogical, but if you think about it long enough, you'll realize that it's not that bad, and it's been known to perform better. Most circular references can be solved this way.

                              • Selecting fact table rows when a link table has been created

                                Hi Oleg,

                                you confirm my mind about QV best practices:

                                1) managing star schema dwh is better than show-flake schema one. When it is possible, it transforms a show-flake schema into a star one;

                                2) working with an unique fact table is better than working with two or more fact tables. When more fact table it is encountered, concatenating them.

                                Unfortunately, these best practices are little documented and this aspect isn't a PRO for a new and innovative BI tool. To learn better this tool it is very important to know them before the development. F.e, for me a circular reference it isn't a mere performance problem.

                                However, or QV or a traditional BI tool both must manage well the common issues of the BI/dwh area like the natural hiearchies (time, geography, ...).

                                Thanks

                                  • Selecting fact table rows when a link table has been created
                                    Oleg Troyansky

                                    Your conclusions are correct in general, but might be taking things too far in details:

                                    - While star schema is easier to manage in memory, no one ever proclaimed it to be a "best practice". I'm being very pragmatic about those issues. When your data is very large, you have to be very efficient and you have to take a few extra steps to button up all your tables, to squeeze all the possible performance you can. On the other hand, if your data is average in size, you might as well save time and money for your client or your employer and create an "OK" data model that will do the job perfectly and cost just a fraction of the "best of the best" model. After all, our solutions are service a certain need and then eventually be disposed of. Not always you have to work hard to gain 100% perfection - it costs too much.

                                    - Same consideration goes to Star vs. Snowflake. Concatenating tables is not always the remedy. This is where Data Modeling becomes more of an art and less of a science. You may ask 5 people and get 5 correct answers about resolving certain data model issues. There is not a single "book answer".

                                    - When you are talking about QlikView associative database, please don't call it "dwh" - you are insulting so many people on this forum :-)

                                     


                                    pscorca69 wrote:Unfortunately, these best practices are little documented and this aspect isn't a PRO for a new and innovative BI tool.


                                    This sounds to me as a misnomer. Most "new and innovative" tools lack extensive documentation - or they wouldn't be "new and innovative". And still, there is a discussion in the Reference Manual about avoiding Synthetic Keys and Circular References. It's also included in the standard Developer II training, for those who cares to get trained.

                                    So, I can't quite share your criticism...

                                      • Selecting fact table rows when a link table has been created

                                        Hi Oleg,

                                        I'm learning and evaluating QV to propose solutions with a good BI tool. QV is reached the version number 10. In 2011 it is a LEADER in the Gartner Magic Quadrant for the Business Intelligence Platform. When a product reachs this goal should be more professional. An aspect to evaluate positively a tool is as it's supported by official articles, tutorials, references, manuals and official community. So, a beginner finds the product best practices after proofs or asking into the community: fe, does it exist a good article on the official site to talk about these best practices? You have given me your suggests, you aren't agree with link table use that I have found on QV blog. A person that must learn needs possibly an unique official voice to an important subject like the development best practices before starting with the development.

                                        I'm sorry if I have insulted you and other one in this community, but I have referred to the BI/dwh area to mention the related general concepts and issues. The area is common between the associative tool as QV and the other one. Moreover I have referred to the db that I read, structured as a datawarehouse.

                                        For me, QV isn't a fine girl to kiss but simply a good tool to work!!! And so this post can be closed.

                                        Many Thanks

                          • Selecting fact table rows when a link table has been created

                            Hi pscorca,

                             

                            If I read your original question correct, I think you have 1 fact table and loops to dimensions. In that case, neither link tables or concatenation will be what you are looking for. You also state that YEAR has a different name in the dimension and fact table. If you were to use link tables, you typically create a synthetic key between a fact table and a link table.

                             

                            For you actual question, I'm assuming 2 possible scenario's regarding a loop with YEAR:

                            - there is a loop because you point to two fields in your time dimension from the same fact

                            - there is a loop because you use year twice from the same fact (like startyear, endyear?)

                             

                            Like I said, in either way, link tables won't help you. If I understand your problem correct, there are two possible solutions:

                            - create a seperate YEAR dimension for the second use, with a different name (related to it's meaning?). There is usually no need to have both references point to the same dimension.

                            - derive a measure from the difference in year. You would keep reference to your time dimension as f.i. the startyear and a measure years_passed, enabling you to reconstruct the endyear if needed.

                             

                            if I missed the point you really have to be more clear on the structure you are trying to implement.

                             

                            good luck, Jeroen