6 Replies Latest reply: Nov 4, 2014 12:26 AM by chinna katiki RSS

    What is Perfect Key & Information Density?

    Shantanu Sardar

      Hello Forums/Friends,

       

      I am aware of synethic key, circular table, etc. But What is the Perfect Key & Information Density?

      What is the difference between Perfect Key & Information Density?

       

      Can please any body can explain?

       

      Thanks in Advance

       

      SRS

        • Re: What is Perfect Key & Information Density?
          Stefan Wühl

          Both are information labels displayed in the table viewer when hovering over a table.

           

          From the HELP:

          Information density is the number of records that have values (i.e. not NULL) in this field as compared to the total number of records in the table.

           

          Not 100% sure what a Perfect Key (I personally don't care about this label) is, check this thread for a recent discussion:

          http://community.qlik.com/message/230152#230152

          • Re: What is Perfect Key & Information Density?

            From the Book QlikView 11 for Developers:

             

            "The name of the field. Optionally, if the field is a key field, a qualifier is shown enclosed in square brackets. This qualifier indicates the following levels of key quality:

             

            [Perfect Key] indicates that every row contains a key value, and that all of these key values are unique. At the same time, the field's subset ratio is 100 percent. This qualifier should be seen in dimension tables, where every key should uniquely identify a single record.

             

            [Primary Key] indicates that all key values are unique, but not every row contains a key value or the field's subset ratio is less  than 100 percent

             

            [Key] indicates that the key is not unique. This qualifier is usually seen in fact tables, where the same dimension value may be associated with many different facts.

             

            Information density of the field, which indicates the percentage of rows that contain a non-null value.

             

            Subset ratio, which shows the percentage of all distinct values for a field in the table compared to all the distinct values for that field in the entire data model. It is only relevant for key fields since they are present in multiple tables and do not all share the same value. Subset ratios can be used to easily spot problems in key field associations. For example, when the combined total of subset ratios for multiple tables is 100 percent, this may indicate that there are no matching keys between these tables.

             

            Tags, which show the tags applied to the field. Some of these, such as tags that indicate if the field is a key field or tags indicating the data type of the field, are automatically generated. Other tags can be manually applied.

            "

            • Re: What is Perfect Key & Information Density?
              chinna katiki

              Perfect Key Vs Primary key:

               

              The [Perfect Key] is a bit more: it is a [Primary Key] (unique) with no "orphans".

               

              Let me clarify this with 2 examples:

               

              Example 1)

               

              InvoiceDetails:

              LOAD *

              INLINE [

              InvoiceID, InvoiceLine, ProductID, Quantity, Amount

              Inv001,1,PR01,10,50

              Inv001,2,PR02,10,40

              Inv002,1,PR01,30,150

              Inv002,2,PR03,10,800

              ];

               

              Invoices:

              LOAD *

              INLINE [

              InvoiceID, ClientID

              Inv001,CL0001

              Inv002,CL0002

              ];

               

               

              In this example, InvoiceID is [Key] in InvoiceDetails and [Perfect Key] in Invoices: it is unique and it has no orphans.

               

               

              Example 2)

               

              InvoiceDetails:

              LOAD *

              INLINE [

              InvoiceID, InvoiceLine, ProductID, Quantity, Amount

              Inv001,1,PR01,10,50

              Inv001,2,PR02,10,40

              Inv002,1,PR01,30,150

              Inv002,2,PR03,10,800

              Inv003,1,PR01,10,50

              ];

               

               

              Invoices:

              LOAD *

              INLINE [

              InvoiceID, ClientID

              Inv001,CL0001

              Inv002,CL0002

              ];

               

               

              In this example, InvoiceID is [Key] in InvoiceDetails and [Primary Key] in Invoices. There is a value in InvoiceDetails, Inv003, that does not find a corresponding value in the Invoices table. This is called a "orphan". For this reason InvoiceID is not anymore perfect, but it is still primary, because it is unique.

               

              If you copy-paste the example you can verify it.

               

               

              Information density and Subset Ratio:

              Information density of the field, which indicates the percentage of rows that contain a non-null value

               

              Subset ratio, which shows the percentage of all distinct values for a field in the

              table compared to all the distinct values for that field in the entire data model. It is

              only relevant for key fields since they are present in multiple tables and do not all

              share the same value.

              Subset ratios can be used to easily spot problems in key field

              1. associations.

                   For example, when the combined total of subset ratios for multiple

              tables is 100 percent, this may indicate that there are no matching keys between

              these tables.

               

              REF: QlikView 11 For Developers..

               

              Let me give you simple example

               

              Sales:

              Load * Inline

              [

                Customer, Sales

                A, 100

                B, 200

                D, 300

              ];

               

              Customer:

              Load * Inline

              [

                Customer

                A

                B

                C

                D

              ];

               

              If you write above sample script and will check the Table (CTRL + T), you will find two tables.

              Sales and Customers


              On Sales Table, if you hover the mouse on Customer field, you can see the Subset Ratio is 75% because there is not sales data for Customer C.

               

              Now if you change the script for table Customer like below..

              Customer:

              Load * Inline

              [

                Customer

                A

                B

                C

                D

              ]

              Where Exists (Customer);

               

              If will not load the Customer C as there are no sales data for the same.

              Now check the Subset Ratio. It will be 100%

              Definitions of Information Density and Subset Ratio (Source – Reference Guide):

              Information Density is the number of records that have values (i.e. not NULL) in this field as compared to the total number of records in the table.

              Subset ratio is the number of distinct values of this field found in this table as compared to the total number of distinct values of this field (that is other tables as well).

               

               

              BR,

              Chinna