13 Replies Latest reply: Feb 6, 2014 8:25 AM by Thomas Jensen

    flat table structure to star schema



      I'd like to know what the best practice is for building a star schema in QlikView when your starting point is one big flat table.


      E.g. I have this table loaded into QlikView:




      What is the best practice for splitting this up into dimensions and facts?


      Let's say I want to create a ShopDimension, a CustomerDimension and a DateDimension.


      Would I then do a resident load from the big table and just create a surrogate key? Or what do you think is best?

        • Re: flat table structure to star schema

          Hi Thomas


          Actually, your flat table structure might work better than a star schema in QlikView. When QlikView builds its in-memory tables from the script, it stores each field's distinct values once in a lookup and keeps only compact bit-level pointers to them in the table rows. This is one of the ways it reduces the size of the data. See this article here for more information.




          Creating an ID code and a lookup table for a text field may actually use more memory than just leaving the text field in, as QV has to create a "lookup" for the text and then an additional "lookup" for the code, rather than just the one.


          Because of this, calculations often run quicker if all the components are in the same table, rather than reaching across lookup tables.


          However, this obviously depends on the rest of the data loaded. If you want to add more information to the lookups, etc., then this might not be the case.




          After loading in the big table, you can create a lookup by loading resident from it, and using DISTINCT to get the distinct values:


          LookupTable:
          LOAD DISTINCT ShopID, ShopName
          RESIDENT BigTable;


          This should be quicker than loading from source again, as QlikView will already have the data in memory.
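
          One thing to watch with a lookup like this: if both ShopID and ShopName stay in BigTable as well, the two tables share two fields and QlikView will link them through a synthetic key. A minimal sketch of one way around that, assuming ShopID alone identifies a shop and BigTable is already loaded (ShopDimension is just an illustrative table name):

          ```qlikview
          // Build the dimension from the data already in memory.
          ShopDimension:
          LOAD DISTINCT ShopID, ShopName
          RESIDENT BigTable;

          // Remove the descriptive field from the big table so that
          // ShopID is the only field the two tables have in common.
          DROP FIELD ShopName FROM BigTable;
          ```

          Whether this actually saves memory or helps performance is something to measure on your own data, for the reasons given above.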



          • Re: flat table structure to star schema
            Peter Cammaert

            I subscribe to Erica's first suggestion. Leave everything in this one big table, and your performance will be excellent because no associative links need to be traced. Dimensions will operate just the same, whether in a separate table or embedded in the Facts table.


            Only one exception: you may want to create a Master Calendar for your CreateDate field, as there may be holes in that field.


            But this is only a starting point, you were saying?





              • Re: flat table structure to star schema

                My table consists of 1.7 billion rows of transactions.


                My CreateDate should be my key to my Master Calendar (maybe I'll create a new key in the table without the timestamp and so on). What do you think the best solution would be?


                And yes, this is only a starting point. This one big table contains most of the data needed, but later on there may be a new table that needs to be joined on, or something else.

                  • Re: flat table structure to star schema

                    Master calendars are a great idea and there is a lot of guidance on the community.


                    Also check out the AutoNumber() functions when you are creating keys for this, as they are a compact way to create a key in QV.
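
                    As a rough illustration of what AutoNumber() does, here is a small self-contained sketch with made-up inline data. The field names CustomerID and Amount, the table name Sales and the 'ShopCustomer' counter ID are all assumptions for the example. Each distinct combination gets the next sequential integer, so a long concatenated text key is replaced by a small number:

                    ```qlikview
                    // Hypothetical sample data, only to show the behaviour.
                    Sales:
                    LOAD
                        // Same ShopID/CustomerID combination -> same integer key.
                        AutoNumber(ShopID & '|' & CustomerID, 'ShopCustomer') AS ShopCustomerKey,
                        Amount
                    INLINE [
                    ShopID, CustomerID, Amount
                    S1, C1, 100
                    S1, C2, 250
                    S1, C1, 75
                    ];
                    ```

                    Keep in mind that the numbers are handed out in load order within a single script run, so the same value can get a different number in another reload. Keys made this way should be generated in the same script for every table that uses them.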

                      • Re: flat table structure to star schema

                        Yes, the calendar part I've got covered.


                        I'll surely have to look into how I do lookups and how I get the key into my "bigfacttable" instead of the real value.


                        Maybe something like this:

                        • Load the big table.
                        • Load distinct values from it into a dimension.
                        • Do a lookup between the dimension and the big table to create a new "big table", then drop the first one, so that the new big table only consists of values and keys.
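
                        The steps above could be sketched roughly like this. It assumes BigTable is already loaded and uses ShopName as the example field; TempShops, ShopKeyMap, ShopKey and FactTable are names made up for the sketch. Note that BigTable and FactTable exist side by side for a moment, which matters for memory with this many rows.

                        ```qlikview
                        // 1. Distinct values into a temporary table.
                        TempShops:
                        LOAD DISTINCT ShopName
                        RESIDENT BigTable;

                        // 2. Dimension with a sequential integer key.
                        ShopDimension:
                        LOAD
                            RecNo() AS ShopKey,
                            ShopName
                        RESIDENT TempShops;

                        DROP TABLE TempShops;

                        // 3. Mapping table: real value -> key.
                        ShopKeyMap:
                        MAPPING LOAD ShopName, ShopKey
                        RESIDENT ShopDimension;

                        // 4. New fact table with the key looked up per row.
                        FactTable:
                        NoConcatenate
                        LOAD
                            *,
                            ApplyMap('ShopKeyMap', ShopName, Null()) AS ShopKey
                        RESIDENT BigTable;

                        // 5. Drop the original table and the real value from the facts.
                        DROP TABLE BigTable;
                        DROP FIELD ShopName FROM FactTable;
                        ```

                        The same pattern would be repeated per dimension. Whether the result beats the single flat table is worth testing, given what was said earlier in the thread.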

                      • Re: flat table structure to star schema
                        Peter Cammaert



                        To make sure that your end-users still say hello to you in the morning, consider choosing from the following:

                        • Aggregate whatever can be aggregated, and drop the historical records that nobody needs, or
                        • Take an opportunistic view of the data model, and use whatever gives you acceptable performance. For instance, do a comparison by testing a single-table model against a multiple-table model with, say, 10 million rows.


                        Indeed, a standard Master Calendar can be built and coupled to the transaction table by loading Floor([CreateDate]) AS CreateDate, which strips the time part and leaves only the day. If the design doesn't need a day-level calendar, reduce granularity to YearMonth or YearQuarter.
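
                        For reference, a minimal sketch of such a calendar, assuming the transaction table is called BigTable and was loaded with Floor([CreateDate]) AS CreateDate. The names DateRange, vMinDate, vMaxDate, DateNum and the calendar fields are illustrative only.

                        ```qlikview
                        // Find the first and last day present in the transactions.
                        DateRange:
                        LOAD
                            Min(CreateDate) AS MinDate,
                            Max(CreateDate) AS MaxDate
                        RESIDENT BigTable;

                        LET vMinDate = Num(Peek('MinDate', 0, 'DateRange'));
                        LET vMaxDate = Num(Peek('MaxDate', 0, 'DateRange'));
                        DROP TABLE DateRange;

                        // One row per day between the two, so holes in the data are filled.
                        MasterCalendar:
                        LOAD
                            DateNum                              AS CreateDate,  // numeric key, matches Floor()
                            Date(DateNum)                        AS Date,
                            Year(DateNum)                        AS Year,
                            Month(DateNum)                       AS Month,
                            Date(MonthStart(DateNum), 'YYYY-MM') AS YearMonth,
                            'Q' & Ceil(Month(DateNum) / 3)       AS Quarter;
                        LOAD
                            $(vMinDate) + IterNo() - 1 AS DateNum
                        AUTOGENERATE 1
                        WHILE $(vMinDate) + IterNo() - 1 <= $(vMaxDate);
                        ```

                        On a very large table the Min/Max pass over the resident data can take a while, so it may be worth getting the date range some other way.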



                          • Re: flat table structure to star schema

                            Haha. We have discussed the level of aggregation, and sadly we need everything at transaction level.


                            But it's a good idea to try splitting the data into large segments to see whether it performs better. Maybe even split it out into several QVW documents and then combine them via QVD loads? However, I only need to do the full load once; the rest can be incremental loads.
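
                            A rough sketch of what the incremental part could look like, under several assumptions: a database connection is already open, the source table is called dbo.Transactions, rows are only ever inserted (never updated or deleted), the database accepts the timestamp literal format used here, and vLastReload was set at the end of the one-time full load. All of these names are hypothetical.

                            ```qlikview
                            // Fix the upper bound before reading, so rows arriving
                            // during the reload are picked up next time.
                            LET vThisReload = Timestamp(Now(), 'YYYY-MM-DD hh:mm:ss');

                            // Only the rows created since the previous reload.
                            Transactions:
                            SQL SELECT *
                            FROM dbo.Transactions
                            WHERE CreateDate >= '$(vLastReload)'
                              AND CreateDate < '$(vThisReload)';

                            // Append the history already stored in the QVD, if there is one.
                            IF NOT IsNull(QvdCreateTime('Transactions.qvd')) THEN
                                Concatenate (Transactions)
                                LOAD * FROM [Transactions.qvd] (qvd);
                            END IF

                            // Write the combined result back for the next run.
                            STORE Transactions INTO [Transactions.qvd] (qvd);

                            LET vLastReload = vThisReload;
                            ```

                            The user-facing document would then just load Transactions.qvd, which keeps the heavy database work out of it.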