Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello all,
How information desity and subset ratio are more helpful in data modelling.
Thanks
you can find some good answers here
information density and subset ratio? What is the use of this?
Information density: says the % of non null values,
for a key field it should be 100%, because null values cannot be joined to null and something went wrong when creating key.
Sometimes it will be helpful to track that some key values are not in another table when joining two tables.
Subset ratio: It says you the how many distinct values in that field within a table with comparing to values in field of overall data model.
Ex: Product: ProID, ProName ---> It seems to be master, it has 200 product ids, Subset ratio is 100%
If it is below 100% then something went wrong with the mapping, because there are product ids that are exist in Order which are not in Product(Master) table.
Order: OrderID, ProID, Qnty, Price --> It has 300 records with 100 distinct product ids, Subset ratio is 50%
Information density of the field, which indicates the percentage of rows that contain a non-null value
Subset ratio, which shows the percentage of all distinct values for a field in the
table compared to all the distinct values for that field in the entire data model. It is
only relevant for key fields since they are present in multiple tables and do not all
share the same value.
Subset ratios can be used to easily spot problems in key field
associations.
For example, when the combined total of subset ratios for multiple
tables is 100 percent, this may indicate that there are no matching keys between
these tables.
REF: QlikView 11 For Developers..
Let me give you simple example
Sales:
Load * Inline
[
Customer, Sales
A, 100
B, 200
D, 300
];
Customer:
Load * Inline
[
Customer
A
B
C
D
];
If you write above sample script and will check the Table (CTRL + T), you will find two tables.
Sales and Customers
On Sales Table, if you hover the mouse on Customer field, you can see the Subset Ratio is 75% because there is not sales data for Customer C.
Now if you change the script for table Customer like below..
Customer:
Load * Inline
[
Customer
A
B
C
D
]
Where Exists (Customer);
If will not load the Customer C as there are no sales data for the same.
Now check the Subset Ratio. It will be 100%.
If subset ratio is less than 100%, the key is called as Primary Key
but for 100% it is called as Perfect Key
Hi Manish,
Thanks for the explanation.I just want to know,
Are Subset ratio (100% ) and information Density (100%) the only criterias for a key to become a Perfect Key ?
Is there any other feature of key that make it a Perfect key in addition to the subset ratio and information density being 100% ?
Thanks
Geeta
It should be... but not 100% sure... let some other to suggest on this...