subset ratio

dhavalvyas · ‎2017-11-27

Hi,

How can I achieve a 100% subset ratio for a data model? What are the tips for doing so?

Anil_Babu_Samineni · ‎2017-11-27

Are you asking about information density and subset ratio? What do you want to use this for?


ysj · ‎2018-01-22

Please find the information below:

Information density of the field, which indicates the percentage of rows that contain a non-null value.

Subset ratio, which shows the percentage of all distinct values for a field in the table compared to all the distinct values for that field in the entire data model. It is only relevant for key fields since they are present in multiple tables and do not all share the same value.

Subset ratios can be used to easily spot problems in key field associations. For example, when the combined total of subset ratios for multiple tables is 100 percent, this may indicate that there are no matching keys between these tables.

REF: QlikView 11 for Developers
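The "combined total of 100 percent" symptom can be reproduced outside QlikView. The following is only a rough Python sketch of the arithmetic, not how QlikView computes it internally; the key values and the helper name `subset_ratio` are made up for illustration.

```python
def subset_ratio(table_values, all_tables):
    """Distinct values of the field in one table / distinct values in the whole model."""
    in_table = set(table_values)
    in_model = {value for values in all_tables for value in values}
    return len(in_table) / len(in_model)

# Hypothetical key values: the same customers, but formatted differently
# in each table, so no key in one table matches a key in the other.
orders_keys = ["C1", "C2", "C2"]
customers_keys = ["1", "2"]

tables = [orders_keys, customers_keys]
ratios = [subset_ratio(t, tables) for t in tables]
shared_keys = set(orders_keys) & set(customers_keys)

print(ratios, sum(ratios), shared_keys)
```

Each table holds half of the distinct values and the two ratios add up to 100%, while the set of shared keys is empty. A total of 100% is only a hint, though: it is worth checking the actual key values before concluding that nothing matches.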

Let me give you a simple example:

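// Sales data: note that there is no row for customer C
Sales:
Load * Inline
[
Customer, Sales
A, 100
B, 200
D, 300
];

// Customer list: contains all four customers
Customer:
Load * Inline
[
Customer
A
B
C
D
];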

If you run the sample script above and open the Table Viewer (Ctrl + T), you will find two tables:

Sales and Customer

In the Sales table, if you hover the mouse over the Customer field, you can see that the subset ratio is 75%, because there is no sales data for customer C.

Now change the script for the Customer table as shown below:

Customer:
Load * Inline
[
Customer
A
B
C
D
]
// Keep only customers whose value was already loaded in Sales
Where Exists (Customer);

It will not load customer C, as there is no sales data for that customer.

Now check the subset ratio. It will be 100%.
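To double-check the 75% and 100% figures, the same arithmetic can be sketched in Python. This is only an illustration of the definition using the values from the example; the helper `subset_ratio` and the list filter standing in for Where Exists are my own assumptions, not QlikView code.

```python
def subset_ratio(table_values, model_values):
    """Distinct values in one table divided by distinct values in the whole model."""
    return len(set(table_values)) / len(set(model_values))

sales_customers = ["A", "B", "D"]        # Customer field in the Sales table
customer_table = ["A", "B", "C", "D"]    # Customer field in the Customer table

# Before the change: the model knows A, B, C, D but Sales only has A, B, D.
ratio_before = subset_ratio(sales_customers, sales_customers + customer_table)

# Rough stand-in for Where Exists (Customer): keep only values already loaded via Sales.
already_loaded = set(sales_customers)
filtered_customers = [c for c in customer_table if c in already_loaded]
ratio_after = subset_ratio(sales_customers, sales_customers + filtered_customers)

print(f"{ratio_before:.0%} -> {ratio_after:.0%}")  # prints "75% -> 100%"
```

Dropping customer C removes it from the model entirely, so the ratio rises because the denominator shrinks, not because Sales gained any data.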

If the key values are unique within the table and the subset ratio is less than 100%, the key is called a Primary Key,

but if they are unique and the subset ratio is 100%, it is called a Perfect Key.

Information density:

country	code
Afghanistan	AF
Albania
Algeria	DZ

After loading the above records, go to the Table Viewer (Ctrl + T).

Hover over the "country" field and you will see that its information density is 100%.

Now hover over the "code" field and you will see that its information density is less than 100%, namely 67%.

This is because the code field contains a null value in the second row: only 2 of its 3 rows hold a value.
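The 67% figure can be checked by hand. Below is a small Python sketch of the calculation, assuming the missing code for Albania is loaded as a null (represented here as `None`); the function name `information_density` is just for illustration.

```python
def information_density(values):
    """Share of rows that hold a non-null value in the field."""
    non_null = sum(value is not None for value in values)
    return non_null / len(values)

# The three records from the example; Albania has no code.
rows = [("Afghanistan", "AF"), ("Albania", None), ("Algeria", "DZ")]

country_density = information_density([country for country, _ in rows])
code_density = information_density([code for _, code in rows])

print(f"country: {country_density:.0%}, code: {code_density:.0%}")  # prints "country: 100%, code: 67%"
```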

Check Information Density and Subset Ratio: always perform a high-level integrity check on your data model. You can see the Information Density and Subset Ratio properties in the Table Viewer (Ctrl + T) by hovering over the fields. Investigate wherever Information Density is less than 100% and inform the architect about potential issues with NULL values. I always check the Subset Ratio whenever I perform a QlikView join. This way you know how many distinct key field values are associated with the other table.

Definitions of Information Density and Subset Ratio (Source – Reference Guide):

Information Density is the number of records that have values (i.e. not NULL) in this field as compared to the total number of records in the table.
Subset ratio is the number of distinct values of this field found in this table as compared to the total number of distinct values of this field (that is, in other tables as well).
- Information density on key fields should be 100%, meaning there are no records with a blank in this field.
- Subset ratio should also ideally be 100%. If it is less, there might be records in the data model which cannot be properly linked to.
- Keys should always be "Primary" or "Perfect", both of which mean the key field uniquely identifies each single record.

One thing that is at the root of all trouble in our QlikView environment is that we draw maybe two dozen tables from a database and need to use a number of different keys because, quite often, two or three tables can be linked using one key field, but in another table that field is not present.
I know that is a fundamental issue in database design and whoever made that database should have thought of it, but that doesn't help us ...
The issue is that none of the original designers of the database we use is here anymore, there is no documentation whatsoever, and no one knows how to approach it for any changes 😉

---

We can identify those things from the data model's table view. They are determined automatically by your data source (information density) and by how you define your keys (subset ratio).

How do we test the data model with the help of subset ratio and information density? There is no way to test it as such, but by looking at those percentages in the data model you can see how correctly you have defined your keys. This is not always reliable, though; that is, don't expect a 100% ratio to be the correct value in every case.
