Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I ran into a VERY strange behavior in Qlik Sense set analysis. I'm hoping someone here can explain why set analysis is producing different/unexpected results on identical data set.
The app that demonstrates the issue is attached.
Context
I have one table that looks like this:
Here's the script for that table:
table_temp:
Load *
Inline [
id|name|name_qa_results|super_qa_results
1|name_1|Match|Match
2|name_2|Match|Match
3|name_3|Match|Match
4|name_4|Match|Match
5|name_5|Match|Match
6|name_6|Match|Match
7|name_7
8|name_8
9|name_9|Match|Match
10|name_10|Match|Match
11|name_11|Match|Match
12|name_12
13|name_13
14|name_14
15|name_15
16|name_16|Match|Match
](delimiter is '|');
table:
NoConcatenate
Load id
, If(Len(Trim(name_qa_results)) = 0, Null(), name_qa_results) as name_qa_results
, If(Len(Trim(super_qa_results)) = 0, Null(), super_qa_results) as super_qa_results
Resident table_temp;
Drop Table table_temp;
Then I have an expression in a text object that counts IDs that have a 'Match' value in super_qa_results field or an id number 7, like this:
Count({<super_qa_results = {'Match'}> + <id = {7}>} distinct id)
This expression produces accurate count which is 11:
The Problem
However, if I create an identical table that has two other tables associated with it and one of the associated tables has only one of all possible IDs, then the same expression produces the wrong count.
Here's what the data model for the second table looks like:
and here's the script to generate the second table along with the linked tables:
table_temp_2:
Load *
Inline [
id_2|name_2|name_qa_results_2|super_qa_results_2
1|name_1|Match|Match
2|name_2|Match|Match
3|name_3|Match|Match
4|name_4|Match|Match
5|name_5|Match|Match
6|name_6|Match|Match
7|name_7
8|name_8
9|name_9|Match|Match
10|name_10|Match|Match
11|name_11|Match|Match
12|name_12
13|name_13
14|name_14
15|name_15
16|name_16|Match|Match
](delimiter is '|');
table_2:
NoConcatenate
Load id_2
, If(Len(Trim(name_qa_results_2)) = 0, Null(), name_qa_results_2) as name_qa_results_2
, If(Len(Trim(super_qa_results_2)) = 0, Null(), super_qa_results_2) as super_qa_results_2
Resident table_temp_2;
link:
Load Distinct id_2
, name_2
Resident table_temp_2
;
Drop Table table_temp_2;
dbOwner:
Load *
Inline [
id_2|db
16|db_1
16|db_2
16|db_3
16|db_4
](delimiter is '|');
and here's the screenshot of same text object with same expression (except it referencing fields from the second table, of course) generating wrong results:
Both table and table_2 have identical data, yet when I add two other tables to be associated with table_2, the set analysis starts to produce wrong results.
Any idea why set analysis is producing different results in this example?
Try like this,
Hi @mbespartochnyy,
I'm confused about this happening as well.
From testing I got that the dbOwner table seems to cause your problem, explicitly the limited dataset for id. A workaround I found to work was populating the dbOwner table with the missing IDs. To do so I used this script:
dbOwner:
Load DISTINCT id_2
Resident table_2;
Left Join(dbOwner)
Load *
Inline [
id_2|db
16|db_1
16|db_2
16|db_3
16|db_4
](delimiter is '|');
Thanks for input everyone! I appreciate the ideas.
For additional context, what the set analysis in my example is meant to accomplish is an OR type of logic. More specifically, I need the expression to count number of IDs if a QA field (super_qa_results_2 in the post example) or an additional field equals to some specific value. So, number of IDs where super_qa_results_2 equals 'Match' or id_2 equals 7.
This OR logic seems to break, or at least produce unexpected results, when a dimension table with incomplete list of IDs is introduced.
Workarounds
I tried what @Muthukumar_77 and @lennart_mo suggested and can confirm that those workarounds worked. Thanks again for sharing your thoughts and suggestions on how to work around this issue!
Ultimately, I ended up moving the OR logic out of the set analysis and into the script by using the IF statement, something like this:
If(super_qa_results_2 = 'Match' or id_2 = 7, 1, 0) as audit_check
Then I rewrote the expression in the KPI object to be:
Count({<audit_flag = {1}>} distinct id_2)
And that does produce accurate count:
But it would be nice if the OR logic was working fine within set analysis. Or at least if there was an explanation as to why set analysis produces unexpected results when an OR type of operator is used in a data model that contains a dimension table with incomplete key field values.
In my experience a dimension table with an incomplete set of (foreign) key values is a normal occurrence. I wouldn't expect there to be a need for every dimension table to have a complete list of key field values to have the OR logic in set analysis working.
P.S. If anyone will figure out why the set analysis behaves the way it does when a dimension table with incomplete key field values is introduced, would you please share your findings? I would love to get a better understanding of what's going on within the set analysis in this scenario.
P.P.S. Updated app with the workaround in place is attached.
I wasnt able to open your qvf.
I see there no change in the count for both the codes you have provided:
if you still encounter the same issue, i would suggest creating a flag and use the flag in set expression, something like below:
table_2:
NoConcatenate
Load id_2
, '0' as Table_Flag
, If(Len(Trim(name_qa_results_2)) = 0, Null(), name_qa_results_2) as name_qa_results_2
, If(Len(Trim(super_qa_results_2)) = 0, Null(), super_qa_results_2) as super_qa_results_2
Resident table_temp_2;
the xpression: would be: = Count({<super_qa_results_2 = {'Match'}> + <id_2 = {7}, Table_Flag={0}>} distinct id_2)
It looks like you're using QlikView. The results that you're seeing in QlikView are the results that I was expecting Qlik Sense would also produce.
It looks like there might be a bug in Qlik Sense that produces wrong results.
I wonder if anyone else in the community can see the same problem in Qlik Sense that I'm seeing on my server.
On what grounds do you base that conclusion?
It's not a conclusion. At this point it's only hypothesis, but it's based on the fact that same script and same expression are producing different results in QlikView vs. Qlik Sense.
i strongly believe te above code should work same both in qlikvew and sense as we are not using any functions that arent available in qlikview.
your 'dbOwner' table has only one unique id, sometime if you dont explicilty provide the info in the set expressions there are high chances of it picking up the data from recent table loaded ( i have seen in some cases) and sometimes it picks up all the available ids from all the tables when they are linked together (some other cases)
I appreciate your explanation. Thank you. I hope my question was not out of line.
I agree @Qrishna. Both QlikView and Qlik Sense should produce same results in this case.
I tried rearranging the LOAD script for the 'dbOwner' table to be before the LOAD script of 'table_2' table but the expression in Qlik Sense produced the same, wrong answer.
did you try the flag method? what are the results?
I did. I added the flag field to table_2:
Reloaded the app and made sure the field shows up in the data model and it did:
and I've updated the expression to also include the table_flag field:
But the expression is still returning 1.