Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik GA: Multivariate Time Series in Qlik Predict: Get Details
cancel
Showing results for 
Search instead for 
Did you mean: 
mbespartochnyy
Creator III
Creator III

Why is Set Analysis on Key Fields Produces Unexpected Results?

I ran into a VERY strange behavior in Qlik Sense set analysis. I'm hoping someone here can explain why set analysis is producing different/unexpected results on identical data set.

The app that demonstrates the issue is attached.

 

Context

I have one table that looks like this:

mbespartochnyy_0-1732124810474.png

Here's the script for that table:

 

 

table_temp:
Load *
Inline [
id|name|name_qa_results|super_qa_results
1|name_1|Match|Match
2|name_2|Match|Match
3|name_3|Match|Match
4|name_4|Match|Match
5|name_5|Match|Match
6|name_6|Match|Match
7|name_7
8|name_8
9|name_9|Match|Match
10|name_10|Match|Match
11|name_11|Match|Match
12|name_12
13|name_13
14|name_14
15|name_15
16|name_16|Match|Match
](delimiter is '|');

table:
NoConcatenate
Load id
	, If(Len(Trim(name_qa_results)) = 0, Null(), name_qa_results) as name_qa_results
    , If(Len(Trim(super_qa_results)) = 0, Null(), super_qa_results) as super_qa_results
Resident table_temp;

Drop Table table_temp;

 

 

 

Then I have an expression in a text object that counts IDs that have a 'Match' value in super_qa_results field or an id number 7, like this:

 

 

Count({<super_qa_results = {'Match'}> + <id = {7}>} distinct id)

 

 

This expression produces accurate count which is 11:

mbespartochnyy_1-1732125191813.png

 

The Problem

However, if I create an identical table that has two other tables associated with it and one of the associated tables has only one of all possible IDs, then the same expression produces the wrong count.

Here's what the data model for the second table looks like:

mbespartochnyy_2-1732125501766.png

and here's the script to generate the second table along with the linked tables:

 

 

table_temp_2:
Load *
Inline [
id_2|name_2|name_qa_results_2|super_qa_results_2
1|name_1|Match|Match
2|name_2|Match|Match
3|name_3|Match|Match
4|name_4|Match|Match
5|name_5|Match|Match
6|name_6|Match|Match
7|name_7
8|name_8
9|name_9|Match|Match
10|name_10|Match|Match
11|name_11|Match|Match
12|name_12
13|name_13
14|name_14
15|name_15
16|name_16|Match|Match
](delimiter is '|');

table_2:
NoConcatenate
Load id_2
	, If(Len(Trim(name_qa_results_2)) = 0, Null(), name_qa_results_2) as name_qa_results_2
    , If(Len(Trim(super_qa_results_2)) = 0, Null(), super_qa_results_2) as super_qa_results_2
Resident table_temp_2;

link:
Load Distinct id_2
	, name_2
Resident table_temp_2
;

Drop Table table_temp_2;

dbOwner:
Load *
Inline [
id_2|db
16|db_1
16|db_2
16|db_3
16|db_4
](delimiter is '|');

 

 

 

and here's the screenshot of same text object with same expression (except it referencing fields from the second table, of course) generating wrong results:

mbespartochnyy_3-1732125633400.png

Both table and table_2 have identical data, yet when I add two other tables to be associated with table_2, the set analysis starts to produce wrong results.

Any idea why set analysis is producing different results in this example?

Labels (3)
3 Solutions

Accepted Solutions
Muthukumar_77
Contributor III
Contributor III

Try like this,

table_temp_2:
Load *
Inline [
id_2|name_2|name_qa_results_2|super_qa_results_2
1|name_1|Match|Match
2|name_2|Match|Match
3|name_3|Match|Match
4|name_4|Match|Match
5|name_5|Match|Match
6|name_6|Match|Match
7|name_7
8|name_8
9|name_9|Match|Match
10|name_10|Match|Match
11|name_11|Match|Match
12|name_12
13|name_13
14|name_14
15|name_15
16|name_16|Match|Match
](delimiter is '|');
 
table_2:
NoConcatenate
Load id_2 as id_2_Key
,id_2
,If(Len(Trim(name_qa_results_2)) = 0, Null(), name_qa_results_2) as name_qa_results_2
    , If(Len(Trim(super_qa_results_2)) = 0, Null(), super_qa_results_2) as super_qa_results_2
Resident table_temp_2;
 
link:
Load Distinct id_2 as id_2_Key
, name_2
Resident table_temp_2
;
 
Drop Table table_temp_2;
 
dbOwner:
Load id_2 as id_2_Key,db
Inline [
id_2|db
16|db_1
16|db_2
16|db_3
16|db_4
](delimiter is '|');

 

Test1.png

 

Test2.png

 

Thanks Regards,
Muthukumar P
Qlik Developer

View solution in original post

lennart_mo
Creator II
Creator II

Hi @mbespartochnyy,

I'm confused about this happening as well.

From testing I got that the dbOwner table seems to cause your problem, explicitly the limited dataset for id. A workaround I found to work was populating the dbOwner table with the missing IDs. To do so I used this script:

dbOwner:
Load DISTINCT id_2
Resident table_2;

Left Join(dbOwner)
Load *
Inline [
id_2|db
16|db_1
16|db_2
16|db_3
16|db_4
](delimiter is '|');

 

View solution in original post

mbespartochnyy
Creator III
Creator III
Author

Thanks for input everyone! I appreciate the ideas.

For additional context, what the set analysis in my example is meant to accomplish is an OR type of logic. More specifically, I need the expression to count number of IDs if a QA field (super_qa_results_2 in the post example) or an additional field equals to some specific value. So, number of IDs where super_qa_results_2 equals 'Match' or id_2 equals 7.

This OR logic seems to break, or at least produce unexpected results, when a dimension table with incomplete list of IDs is introduced.

 

Workarounds

I tried what @Muthukumar_77 and @lennart_mo suggested and can confirm that those workarounds worked. Thanks again for sharing your thoughts and suggestions on how to work around this issue!

Ultimately, I ended up moving the OR logic out of the set analysis and into the script by using the IF statement, something like this:

If(super_qa_results_2 = 'Match' or id_2 = 7, 1, 0) as audit_check

 

Then I rewrote the expression in the KPI object to be:

Count({<audit_flag = {1}>} distinct id_2)

 

And that does produce accurate count:

mbespartochnyy_0-1732198325631.png

But it would be nice if the OR logic was working fine within set analysis. Or at least if there was an explanation as to why set analysis produces unexpected results when an OR type of operator is used in a data model that contains a dimension table with incomplete key field values.

In my experience a dimension table with an incomplete set of (foreign) key values is a normal occurrence. I wouldn't expect there to be a need for every dimension table to have a complete list of key field values to have the OR logic in set analysis working.

P.S. If anyone will figure out why the set analysis behaves the way it does when a dimension table with incomplete key field values is introduced, would you please share your findings? I would love to get a better understanding of what's going on within the set analysis in this scenario.

P.P.S. Updated app with the workaround in place is attached.

View solution in original post

18 Replies
Qrishna
Master
Master

I wasnt able to open your qvf.

I see there no change in the count for both the codes you have provided:

2493727 - Set Analysis on Key Fields Produces Different Results (1).PNG

 

if you still encounter the same issue, i would suggest creating a flag and use the flag in set expression, something like below:

table_2:
NoConcatenate
Load id_2
, '0' as Table_Flag
, If(Len(Trim(name_qa_results_2)) = 0, Null(), name_qa_results_2) as name_qa_results_2
, If(Len(Trim(super_qa_results_2)) = 0, Null(), super_qa_results_2) as super_qa_results_2
Resident table_temp_2;

the xpression: would be: = Count({<super_qa_results_2 = {'Match'}> + <id_2 = {7}, Table_Flag={0}>} distinct id_2)

2493727 - Set Analysis on Key Fields Produces Different Results (2).PNG

mbespartochnyy
Creator III
Creator III
Author

It looks like you're using QlikView. The results that you're seeing in QlikView are the results that I was expecting Qlik Sense would also produce.

It looks like there might be a bug in Qlik Sense that produces wrong results.

I wonder if anyone else in the community can see the same problem in Qlik Sense that I'm seeing on my server.

traemilietw
Contributor II
Contributor II

On what grounds do you base that conclusion?

mbespartochnyy
Creator III
Creator III
Author

It's not a conclusion. At this point it's only hypothesis, but it's based on the fact that same script and same expression are producing different results in QlikView vs. Qlik Sense.

Qrishna
Master
Master

i strongly believe te above code should work same both in qlikvew and sense as we are not using any functions that arent available in qlikview.

your 'dbOwner' table has only one unique id, sometime if you dont explicilty provide the info in the set expressions there are high chances of it picking up the data from recent table loaded ( i have seen in some cases) and sometimes it picks up all the available ids from all the tables when they are linked together (some other cases)

traemilietw
Contributor II
Contributor II

I appreciate your explanation. Thank you. I hope my question was not out of line.

mbespartochnyy
Creator III
Creator III
Author

I agree @Qrishna. Both QlikView and Qlik Sense should produce same results in this case.

I tried rearranging the LOAD script for the 'dbOwner' table to be before the LOAD script of 'table_2' table but the expression in Qlik Sense produced the same, wrong answer.

Qrishna
Master
Master

did you try the flag method? what are the results?

mbespartochnyy
Creator III
Creator III
Author

I did. I added the flag field to table_2:

mbespartochnyy_0-1732135250462.png

 

Reloaded the app and made sure the field shows up in the data model and it did:

mbespartochnyy_1-1732135351643.png

 

and I've updated the expression to also include the table_flag field:

mbespartochnyy_2-1732135422191.png

But the expression is still returning 1.