
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Counting distinct unique values based on value in other fields
Hello,
I'm new to Qlik sense and trying to figure out something quite simple. My real database has many thousand rows with data which I need to analyze. My problem is giving me some sleepless nights and now I'm trying to find an answer here.
I have some data and would like to get some statistical numbers from them, nothing fancy just stuff like this:
- How many DISTINCT values - without duplicates - are in a particular column depending on the value of a different column.
- How many DISTINCT duplicates are in a particular column depending on the value of a different column.
- How many duplicates does every DISTINCT duplicate have depending on the value of a different column.
I tried already different approaches with different loading scripts, but unfortunately nothing works the way I like to have it.
Here is an example to explain it a little better:
The test data:
[Test_Data]
LOAD * Inline [
ID, TXT, NUM
0, TXT1, 100001
1, TXT1, 100000
2, TXT1, 100000
3, TXT1, 100000
4, TXT1, 100000
5, TXT1, 100002
6, TXT2, 100001
7, TXT2, 100002
8, TXT2, 100002
9, TXT2, 100000
10, TXT3, 100001
11, TXT3, 100002
12, TXT3, 100005
13, TXT3, 100013
14, TXT3, 100004
15, TXT3, 100004
16, TXT3, 100010
16, TXT3, 100010]
The answers I’m looking for are:
values appearing only once:
- TXT1 - count: 2
- 100001
- 100002
- TXT2 – count: 2
- 100001
- 100000
- TXT3 – count: 4
- 100001
- 100002
- 100005
- 100013
values appearing more than once:
- TXT1 count: 1
- 100000
- TXT2 count: 1
- 100002
- TXT3 count: 2
- 100004
- 100010
duplicate appearing:
- TXT1
- 100000 – count: 4
- TXT2
- 100002 – count: 2
- TXT3
- 100004 – count: 2
- 100010 – count: 2
I tried this:
For values appearing only once:
- Count(DISTINCT{<NUM = {"=Count(NUM) = 1"}>}NUM)
- Does not work because it gets the values independent from TXT
- Sum(Aggr(Count(DISTINCT{<NUM = {"=Count(NUM) = 1"}>}NUM), TXT)
- Doesn’t work, gives me strange results
For values appearing more than once:
- Count(DISTINCT{<NUM = {"=Count(NUM) > 1"}>}NUM)
- Does not work because it gets the values independent from TXT
- Sum(Aggr(Count(DISTINCT{<NUM = {"=Count(NUM) > 1"}>}NUM), TXT)
- Doesn’t work, gives me strange results
For duplicate:
- Count({<NUM = {"=Count(NUM) > 1"}>}NUM)
- Does not work because it gets the values independent from TXT
- Sum(Aggr(Count({<NUM = {"=Count(NUM) > 1"}>}NUM), TXT)
- Doesn’t work, gives me strange results
Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I would recommend adding the count of values to the data table itself, In this case since you are looking for duplicates across two dimensions; I would concat the two fields into a keyfield.
For example:
[Test_Data]:
LOAD * Inline [
ID, TXT, NUM
0, TXT1, 100001
1, TXT1, 100000
2, TXT1, 100000
3, TXT1, 100000
4, TXT1, 100000
5, TXT1, 100002
6, TXT2, 100001
7, TXT2, 100002
8, TXT2, 100002
9, TXT2, 100000
10, TXT3, 100001
11, TXT3, 100002
12, TXT3, 100005
13, TXT3, 100013
14, TXT3, 100004
15, TXT3, 100004
16, TXT3, 100010
16, TXT3, 100010]
LEFT Join (TEST_DATA)
LOAD *,
TXT&'-'&NUM as DuplicateIdentifier
Resident TEST_DATA;
LEFT Join (TEST_DATA)
LOAD
DuplicateIdentifier,
count(DuplicateIdentifier) as CountOccurrence
Resident TEST_DATA
Group by DuplicateIdentifier;
Then you can use SetAnalysis to search for CountOccurrence = {">1"} etc.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
try below expressions
//Distinct Values
=count(aggr(if(Count(NUM)=1,NUM),TXT,NUM))
// Duplicate Values
=count(aggr(if(Count(NUM)>1,NUM),TXT,NUM))
//Duplicate Value Counts
=Concat(aggr(if(Count(NUM)>1,NUM&' : '&count(NUM)),TXT,NUM),',')

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I would recommend adding the count of values to the data table itself, In this case since you are looking for duplicates across two dimensions; I would concat the two fields into a keyfield.
For example:
[Test_Data]:
LOAD * Inline [
ID, TXT, NUM
0, TXT1, 100001
1, TXT1, 100000
2, TXT1, 100000
3, TXT1, 100000
4, TXT1, 100000
5, TXT1, 100002
6, TXT2, 100001
7, TXT2, 100002
8, TXT2, 100002
9, TXT2, 100000
10, TXT3, 100001
11, TXT3, 100002
12, TXT3, 100005
13, TXT3, 100013
14, TXT3, 100004
15, TXT3, 100004
16, TXT3, 100010
16, TXT3, 100010]
LEFT Join (TEST_DATA)
LOAD *,
TXT&'-'&NUM as DuplicateIdentifier
Resident TEST_DATA;
LEFT Join (TEST_DATA)
LOAD
DuplicateIdentifier,
count(DuplicateIdentifier) as CountOccurrence
Resident TEST_DATA
Group by DuplicateIdentifier;
Then you can use SetAnalysis to search for CountOccurrence = {">1"} etc.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
try below expressions
//Distinct Values
=count(aggr(if(Count(NUM)=1,NUM),TXT,NUM))
// Duplicate Values
=count(aggr(if(Count(NUM)>1,NUM),TXT,NUM))
//Duplicate Value Counts
=Concat(aggr(if(Count(NUM)>1,NUM&' : '&count(NUM)),TXT,NUM),',')

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
TM_burgers, tank you for the fast answer.
Your solution is what I was looking for and also very useful if one need the count values in the table for later usage.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Kush, thank you for the fast answer,
Your answer is exactly what I was looking for. I didn't quite understood the "AGGR" function until now. Well, I guess there's a lot left for me to learn with this Qlik Sense, but it's very exiting stuff.
