3 Replies Latest reply: Jun 5, 2017 7:42 AM by Shabaz Mohammed RSS

    Data Model

    prachi DHOK

      hello everyone please share what kind of challenges you faced during developing large Data Model and how to resolve it?

      which schema is the best for it?



      Thank You

        • Re: Data Model
          P M

          When a dimension table is snowflaked, the redundant many-to-one attributes are removed into separate dimension tables. For example, instead of collapsing hierarchical rollups such as brand and category into columns of a product dimension table, the attributes are stored in separate brand and category tables which are then linked to the product table. With snowflakes, the dimension tables are normalized to third normal form. A standard dimensional model often has 10 to 20 denormalized dimension tables surrounding the fact table in a single layer halo; this exact same data might easily be represented by 100 or more linked dimension tables in a snowflake schema.

          We generally encourage you to handle many-to-one hierarchical relationships in a single dimension table rather than snowflaking. Snowflakes may appear optimal to an experienced OLTP data modeler, but they’re suboptimal for DW/BI query performance. The linked snowflaked tables create complexity and confusion for users directly exposed to the table structures; even if users are buffered from the tables, snowflaking increases complexity for the optimizer which must link hundreds of tables together to resolve queries. Snowflakes also put burden on the ETL system to manage the keys linking the normalized tables which can become grossly complex when the linked hierarchical relationships are subject to change. While snowflaking may save some space by replacing repeated text strings with codes, the savings are negligible, especially in light of the price paid for the extra ETL burden and query complexity.








          • Re: Data Model
            Anil Kumar



            In general the star schema is the preferred model as it offers a good balance between the

            various trade-offs.

             In Qlikview also the preferred model is Star Schema, because during runtime there is only very

            less number of joins, excellent response time.

             Also QlikView doesn’t store the duplicate values in the file, it stores distinct values and uses

            pointers to refer when there is a duplicate, so there is no issue of space consumption.


            Hope it helps!!

            • Re: Data Model
              Shabaz Mohammed

              According to me, Dates are the biggest hurdle in data model.

              There will be scenarios where you'll have Dates in each dimension table and mapping correctly ti the Fact can be challenging.

              Specially when Dim Dates has multiple entry and fact date has only one entry. Which means the Fact is not populated with all the Date changes happening in other Dim but only has important records stored.

              I'm currently dealing with it...


              Sometimes you don't have primary key in a dim table and need to map it to fact.. it can also be challenging.

              It mostly depends on what kind of data you get.