7 Replies Latest reply: Jan 23, 2014 9:23 AM by Henric Cronström RSS

    Possible alternative to concatenate, or link tables?

      I am reading about using concatenation and link tables to resolve circular references associated with a multi-fact QV data model.  As a star schema data modeler, I have always created “logical stars”, consisting of separate fact tables for each business process (Ex: Sales, Purchasing, Customer Service, Inventory, etc.)  I struggle with mixing apples and oranges in to a single table (concatenation).  I also struggle with introducing the possibility of dirty data by "manufacturing data" (in ETL), for all combinations of facts (fact link table).  I am used to other metadata based tools like OBIEE or Business Objects , which use “alias” column names to resolve circular references. 

       

      I have demonstrated this in the below QV data model below.  Doing it this way allows multiple fact stars in the same data model with no circular references, and a “real” star schema.

       

      Does anyone see any reason my logic is flawed?

       

      -------------------------------------------------------------------------------

        

      Dimensional Modeling (star schema) for QV data model :

       

      No need to concatenate fact tables, or use link tables to avoid circular references.  Also, xxx_id "foreign keys" are necessary in the facts to allow for slowly changing dimensions.

         

      FactSales:

       

      %sales_product_id,

      %sales_period_id,

      %Promotion_id,

      sales_amount,

      --------------------------


      FactPurchases:

       

      %purchase_product_id,

      %purchase_period_id,

      purchase_amount,

      -----------------------------------

        

      DimProduct:


      %sales_product_id,

      %purchase_product_id,    (same value as %sales_product_id)

      product_code,                     (business key, unique by start_date)

      product_name,                   (SCD Type 2)

      category,                              (SCD Type2)

      product_cost                     (SCD Type2)

      start_date,

      end_date,

      IsCurrentRecordFlag      (set to Yes for only one distinct product_code)

      ...

      --------------

       

      DimCalendar:

       

      %periodID,  (AutoNumber() generated key, used in set analysis YTD, LYTD, etc)

      %sales_period_id  (same value as %PeriodID)

      %purchase_period_id (same value as %PeriodID)

      QuarterID,  (used in set analysis)

      Date,

      Year,

      Month,

      MonthYear,

      Quarter,

      YearWeek

      ...

      ---------------------

       

      SalesPromotion:

       

      %Promotion_id,

      Promotion_name,

      promotion_start

      promotion_end

      ...

        • Re: I see no need to ever use concatenate, or to use link tables..
          Steve Dark

          Hi David,

           

          There are a couple of reasons why I tend to opt for bringing as many dimensions as possible into my main fact table, rather than going for the star or snowflake approach.

           

          The first is one of performance, as users click around your QlikView app all the charts and tables that are visible will be recalculated.  QlikView has to ensure it has all the data required to do that recalculation to hand and associated.  It takes slightly more processing to do this if there are more tables in play.

           

          Another reason I like bringing everything together at the QVD generation stage is that you create a smaller number of more useful QVD files.  If you want a number of developers to all build on the same data layer then you will need to do less explaining if all the dimensions those developers need are in a single file.

           

          Regarding concatenation, I would again advocate this as an approach.  As well as performance here; selections can become a lot simpler if tables are concatenated.  A prime example is an Actuals file and a Budget file.  If you try and join the two on a composite key of Month / Office / SalesPerson (or whatever) then if there are rows in the budget that don't exist in the acutals selections don't work - that is true even when your actuals go up to today and your budget is for the whole year.  If you concatenate the budget onto your actuals any columns that are the same can be selected on (Month, Product Type, Office) and both acutal and budget rows will be selected.  If you pick a dimension in the actuals that doesn't exist in the budget (perhaps Date or SKU) then the budget numbers are excluded - but then a comparison of a single SKU against the whole months budget would not make sense anyway.

           

          With actuals and budgets in separate tables with a join key between them selections on fields can cause a fair bit of weirdness to take place.  Put them in one table and everything is quite a bit simpler.  The same is also true of other combinations of files, such as purchases and sales files.

           

          When concatenating tables in this way you do need to keep an eye on whether you are keeping loads from QVD's optimised (at least for large tables) and there are approaches that can help ensure this.

           

          These sort of performance best practices are exactly the sort of thing I often blog about over on the Quick Intelligence blog.

           

          Cheers,

          Steve

           

          Quick Intelligence