Qlik Community

QlikView Documents

Documents for QlikView related information.

6 ways to Test your QlikView application

IAMDV
Honored Contributor II

6 ways to Test your QlikView application

67_Bad_Code.gif

“Being somewhat right is better than precisely wrong!”

This might be okay in broad spectrum of things in life. But this is no good when you are dealing with the data. As a QlikView expert, you have the responsibility to get things right every time you present the data to the users. No doubt that QlikView is innovative platform to develop data discovery applications. But it’s Developer’s job to present the data accurately because people make decisions and draw conclusions based on your application. And last thing you wanted is going back to your manager saying you are presenting incorrect numbers in the application or receiving an email/call from the business user who challenges you that the data is incorrect.

“Don’t jump the gun!”

It’s more important to present the data accurately while compared to developing quickly. It gets tricky because QlikView is very optimistic/ Developer friendly tool. You might get results but those results are answers to wrong questions you might have asked. So make sure to perform data integrity check before deploying the application. Don’t believe in Developers who say – their code is good without peforming any data integrity checks. And to perform these integrity checks, we don’t have to rely on Testers as it’s simple to perform these checks ourselves.

6 things to check before deploying the application:

1. Perform basic aggregations on the main Fact Table: I'm sure that most of you use incremental load with the dynamic "Where" clause. In this process, we might have a logical error while building the "Where" clause. To mitigate this potential bug, we need to check the row counts and sum total of the measure fields by comparing with the underlying source table. This way you will always know that you have extracted the full data set from the underlying source. This will be first check and for further assistance you can use system fields like $Field, $Table and $Row etc.

2. No aggregation on Key Fields: Key fields in QlikView must always be used only as a Join keys. Make sure you are using “HidePrefix” keyword along with the SET variable statement and use the same special character/ symbol as the prefix while assigning the field name (Example: %CustomerKey instead of [Customer Key]). With this approach, either you or other developers will not use this field accidentally as part of the UI design (You will see this field if the “Show System Fields” is checked). It’s also very important to note that you don’t perform any calculations on this field and remember not to use the key field as the chart dimension. Instead, you can duplicate the same field with the different name.

3. Check for Information Density and Subset Ratio: Always perform high level integrity check on your data model. You can see Information Density and Subset Ratio properties in the Table Viewer (Ctrl + T) by hovering on the fields. Investigate wherever Information Density is less than 100% and inform the Architect about the potential issue(s) with the NULL values. I would always check for Subset Ratio whenever I perform a QlikView Join. This way you know how many key field distinct values are associated to other table.

Definitions of Information Density and Subset Ratio (Source – Reference Guide):

    • Information Density is the number of records that have values (i.e. not NULL) in this field as compared to the total number of records in the table.
    • Subset ratio is the number of distinct values of this field found in this table as compared to the total number of distinct values of this field (that is other tables as well).

4. Check for connection strings in the QlikView script: Logical bugs are very difficult to identify. Generally, you might need to extract the data from more than one source. And sometimes you need to extract the data from multiple environments where the underlying schema will be same. You will be extracting the data from Dev & QA which has same schema and table names but different data. In this case, it’s very hard to debug because everything is right about your query except that you are using old connection string. So make sure that you abstract the connection strings to excel file/ database to manage them from one central place.

5. No syntax error doesn't mean no logical error: Set Analysis is great feature in QlikView. It will be very useful to control, identify and modify the sub set of the data. Generally, if there is no syntax error then it doesn't mean there is no logical error. Syntax errors are your friend's enemies but logical errors are YOUR enemies. Logical errors are more dreadful than the syntax errors. It’s hard to identify the logical errors compared to the syntax errors. So make sure, you always check the Set Analysis expressions compared to SQL queries. Where set modifier is equivalent to SQL "where" clause and set operators are "relational" operators in SQL.

6. Check for intruders in the dimensional tables (AKA: NULL Values): As a rule, we shouldn't have null values in the dimensional tables. You would always expect that your dimension fields should have 100% information density but real world is different from the theory. So it’s important to keep an eye on the dimensional tables. Because it's equally important to know - what is missing compared to what is available!

Calibrate Consulting
Comments
carbal1952
Contributor II

Thanks.

I'll think in this stuff next & every time i write some code.

CB.

IAMDV
Honored Contributor II

Thank you CB.

vadimtsushko
Contributor III

Thanks Deepak.


I have a question regarding one point of your post

2. No aggregation on Key Fields:

...

It’s also very important to note that you don’t perform any calculations on this field and remember not to use the key field as the chart dimension.

What serve as substantiation of this rule? I’ve seen suggestion to evade `count()` on key fields as it has no definite results. On the other hand `count(distinct keyField)` seems to be perfectly valid construct. See: http://community.qlik.com/blogs/qlikviewdesignblog/2013/10/22/a-myth-about-countdistinct


Whats about sum aggregate on key field? Where are its pitfalls?

IAMDV
Honored Contributor II

Hi Vadim,

The military rule from the QV Developer training book is not to use them in calculations. However, they behave okay if you have your dimension from one table instead of two dimensions from different tables. You will see the NULL value in the dimension field where the association doesn't have a join on both the tables but you'll still the calculation on the key field. They do provide the results and most of the time it's expected results. But this might cause confusion to other Developers & Designers. So it's best to avoid perform calculations on the Key Fields.

Thanks,

DV

vadimtsushko
Contributor III

Thank you for clarification, Now I understand where this rule come from.

I hope posing a questions about military rule will not necessarily lead to summary execution

I'm still trying to grasp reason behind this rule. I've tried to model situation which you specified if I correctly understood it:

Two dimensions from different tables connected to same key field, one of dimensional table do not have all values presented in transactional table.

Luckily standard test script fit almost perfectly. So I've created test applications, inserted standard test script and modified it slightly. In one dimensional table removed couple of rows and added duplicate of key field `Num` into transactional and dimensional table.

Next I've tested three mode of using key fields or its duplicate:

  • Presenting in charts in pair with dimension from other table.
  • Using in aggregation. In that example - to show only first five rows based on values of Sum(Expression1). =Aggr(If(Rank(Sum(Expression1))<6,Num, 'Other nums'),Num) for example. And its counterparts with duplicate fields.
  • Using for Count() and Count(DISTINCT)

Some test of using Key fields in calculations and charts

As I read results of these tests:

  • Count() without Distinct qualifier gave inconsistent results
  • Count() with Distinct qualifier gave same results in all three case
  • In all other modes Key field and duplicate of Key field in Transactional table behave identically. But obviously additional key in transactional table lead to some penalty in memory consumption
  • Field duplicate in dimensional table which does not have rows for all available values of key in transactional table behave differently. BTW only with that mode I've get rows with Null values. And I guess theoretically aggregation like =Aggr(If(Rank(Sum(Expression1))<6,Num_InDimension, 'Other nums'),Num_InDimension) should be slower then aggregation on key field as it involve hope between tables.

Do I miss something in my speculations?

IAMDV
Honored Contributor II

Vadim - Sorry for the delay. I shall reply back soon. BTW - Nice addition to this post!

Cheers,

DV

Version history
Revision #:
1 of 1
Last update:
‎02-14-2014 10:33 AM
Updated by: