Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
shantanu73
Creator II
Creator II

What is Perfect Key & Information Density?

Hello Forums/Friends,

I am aware of synethic key, circular table, etc. But What is the Perfect Key & Information Density?

What is the difference between Perfect Key & Information Density?

Can please any body can explain?

Thanks in Advance

SRS

6 Replies
swuehl
MVP
MVP

Both are information labels displayed in the table viewer when hovering over a table.

From the HELP:

Information density is the number of records that have values (i.e. not NULL) in this field as compared to the total number of records in the table.

Not 100% sure what a Perfect Key (I personally don't care about this label) is, check this thread for a recent discussion:

http://community.qlik.com/message/230152#230152

shantanu73
Creator II
Creator II
Author

Hello Swuehl,

Thank for response. But as per your explaintion I had created the Inline table and checked. It is always showing Inforamtion Density 100%. Following is the script:

Invoices2:

LOAD

INLINE

[InvoiceID2, ClientID2
Inv001,CL0001
NULL,CL0002]

Can u pls explain me with example

Thanks in Advance

Shantanu


swuehl
MVP
MVP

If you write 'NULL' as a value in an inline table, this is not NULL, it's just a string.

Try like this

TEST1:

LOAD recno() as RecID,

     if(RAND() > 0.1, 1 ) as Value

AutoGenerate 100;

and check information density on field Value.

shantanu73
Creator II
Creator II
Author

Hello Swuehl

Thanks for Example.

Shantanu

Not applicable

From the Book QlikView 11 for Developers:

"The name of the field. Optionally, if the field is a key field, a qualifier is shown enclosed in square brackets. This qualifier indicates the following levels of key quality:

[Perfect Key] indicates that every row contains a key value, and that all of these key values are unique. At the same time, the field's subset ratio is 100 percent. This qualifier should be seen in dimension tables, where every key should uniquely identify a single record.

[Primary Key] indicates that all key values are unique, but not every row contains a key value or the field's subset ratio is less  than 100 percent

[Key] indicates that the key is not unique. This qualifier is usually seen in fact tables, where the same dimension value may be associated with many different facts.

Information density of the field, which indicates the percentage of rows that contain a non-null value.

Subset ratio, which shows the percentage of all distinct values for a field in the table compared to all the distinct values for that field in the entire data model. It is only relevant for key fields since they are present in multiple tables and do not all share the same value. Subset ratios can be used to easily spot problems in key field associations. For example, when the combined total of subset ratios for multiple tables is 100 percent, this may indicate that there are no matching keys between these tables.

Tags, which show the tags applied to the field. Some of these, such as tags that indicate if the field is a key field or tags indicating the data type of the field, are automatically generated. Other tags can be manually applied.

"

Anonymous
Not applicable

Perfect Key Vs Primary key:

The [Perfect Key] is a bit more: it is a [Primary Key] (unique) with no "orphans".

Let me clarify this with 2 examples:

Example 1)

InvoiceDetails:

LOAD *

INLINE [

InvoiceID, InvoiceLine, ProductID, Quantity, Amount

Inv001,1,PR01,10,50

Inv001,2,PR02,10,40

Inv002,1,PR01,30,150

Inv002,2,PR03,10,800

];

Invoices:

LOAD *

INLINE [

InvoiceID, ClientID

Inv001,CL0001

Inv002,CL0002

];

In this example, InvoiceID is [Key] in InvoiceDetails and [Perfect Key] in Invoices: it is unique and it has no orphans.

Example 2)

InvoiceDetails:

LOAD *

INLINE [

InvoiceID, InvoiceLine, ProductID, Quantity, Amount

Inv001,1,PR01,10,50

Inv001,2,PR02,10,40

Inv002,1,PR01,30,150

Inv002,2,PR03,10,800

Inv003,1,PR01,10,50

];

Invoices:

LOAD *

INLINE [

InvoiceID, ClientID

Inv001,CL0001

Inv002,CL0002

];

In this example, InvoiceID is [Key] in InvoiceDetails and [Primary Key] in Invoices. There is a value in InvoiceDetails, Inv003, that does not find a corresponding value in the Invoices table. This is called a "orphan". For this reason InvoiceID is not anymore perfect, but it is still primary, because it is unique.

If you copy-paste the example you can verify it.

Information density and Subset Ratio:

Information density of the field, which indicates the percentage of rows that contain a non-null value

Subset ratio, which shows the percentage of all distinct values for a field in the

table compared to all the distinct values for that field in the entire data model. It is

only relevant for key fields since they are present in multiple tables and do not all

share the same value.

Subset ratios can be used to easily spot problems in key field

  1. associations.

     For example, when the combined total of subset ratios for multiple

tables is 100 percent, this may indicate that there are no matching keys between

these tables.

REF: QlikView 11 For Developers..

Let me give you simple example

Sales:

Load * Inline

[

  Customer, Sales

  A, 100

  B, 200

  D, 300

];

Customer:

Load * Inline

[

  Customer

  A

  B

  C

  D

];

If you write above sample script and will check the Table (CTRL + T), you will find two tables.

Sales and Customers


On Sales Table, if you hover the mouse on Customer field, you can see the Subset Ratio is 75% because there is not sales data for Customer C.

Now if you change the script for table Customer like below..

Customer:

Load * Inline

[

  Customer

  A

  B

  C

  D

]

Where Exists (Customer);

If will not load the Customer C as there are no sales data for the same.

Now check the Subset Ratio. It will be 100%

Definitions of Information Density and Subset Ratio (Source – Reference Guide):

Information Density is the number of records that have values (i.e. not NULL) in this field as compared to the total number of records in the table.

Subset ratio is the number of distinct values of this field found in this table as compared to the total number of distinct values of this field (that is other tables as well).

BR,

Chinna