Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
w21450
Contributor III
Contributor III

Counting specific overlapping rows with set expression

Hi everyone

 

I have an issue with set analysis that I hope someone in here can help with:)

I will try to simplify my question as best I can. I have attached a screenshot of some sample data, so it makes more sense when reading this.

I have a datamodel where I have a lot of task with different IDs (S1, S2, S3, S4 etc.).  These tasks can have two different subtypes on any given day. So on 01/10/2021 Task S1 can have two different subtypes  (T1 and T20). This would result in two different rows with the same date and with the same task ID (S1) but different subtypes (T1 and T20). So the subtypes can overlap on the same day. 

If you look at the attached file, I want to Count Distinct on column A which is called "Task ID". When there are no overlapping subtypes this is easy. In a set expression I just distinct count all rows with type = T1. However, because of our specific datamodel and its foreign keys it is very important that when the subtypes overlap on the same date, the Task ID should only be counted if it has type = T20 and NOT if it has type T1.

So while there are no overlapping types on the same date the all TaskID's with T1 should be counted but when they overlap only Task ID rows with T20 should be counted.

In the attached picture I have drawn a box around the area where they overlap. And in column D I have marked if the row should be counted (Column D is not part of the data but just a column I put in, so it would be easier to explain to you).

Is there any way to do this in a set expression? Or any other way?

Thanks in advance.

 

Best,

Peter

1 Solution

Accepted Solutions
Vegar
MVP
MVP

An alternative to Rubenbarin (if you have or are able to create) a field that is unique per TaskID-date combination is to use inplicit set analysis. 

Consider this script:

LOAD
AutoNumberHash128(Date,[Task ID] ) as [DateTaskID] , //Unique per ID and date
*
inline
[
Date, Task ID, Type, Sould be counted
2021-10-07, S1,T1, x
2021-10-07, S2,T1,
2021-10-07, S2,T20, x
2021-10-08, S1,T1,
2021-10-08, S1,T20,x
2021-10-09, S1,T1,
2021-10-09, S1,T20,x
2021-10-10, S1,T1,
2021-10-10, S1,T20,x
2021-10-15, S1,T1, x
2021-10-16, S1,T1,x
];

 

Then you can calculate what you are asking for by this expression:

=count({<DateTaskID=P({$<Type={'T20'}>}),Type = {'T20'} >} DateTaskID)
+ count({<DateTaskID=E({$<Type={'T20'}>}),Type = {'T1'}>} DateTaskID)

View solution in original post

6 Replies
rubenmarin

Hi, you can try flagging records while loading, doing a sorted load and using peek to flag records, with this example can be something like:

LOAD
  Date,
  [Task ID],
  Type,
  If(Peek(Date)=Date and Peek([Task ID])=[Task ID] and Type='T1' and Peek(Type)='T20', 0, 1) as flagCounted
Resident/from...
Order By [Task ID], Date, Type desc;

So the rows with T1 where there is a T20 for the same task and date are flagged as 0, so you can use a set analysis with flagCounted={1} to avoid them.

Vegar
MVP
MVP

An alternative to Rubenbarin (if you have or are able to create) a field that is unique per TaskID-date combination is to use inplicit set analysis. 

Consider this script:

LOAD
AutoNumberHash128(Date,[Task ID] ) as [DateTaskID] , //Unique per ID and date
*
inline
[
Date, Task ID, Type, Sould be counted
2021-10-07, S1,T1, x
2021-10-07, S2,T1,
2021-10-07, S2,T20, x
2021-10-08, S1,T1,
2021-10-08, S1,T20,x
2021-10-09, S1,T1,
2021-10-09, S1,T20,x
2021-10-10, S1,T1,
2021-10-10, S1,T20,x
2021-10-15, S1,T1, x
2021-10-16, S1,T1,x
];

 

Then you can calculate what you are asking for by this expression:

=count({<DateTaskID=P({$<Type={'T20'}>}),Type = {'T20'} >} DateTaskID)
+ count({<DateTaskID=E({$<Type={'T20'}>}),Type = {'T1'}>} DateTaskID)

w21450
Contributor III
Contributor III
Author

Hi Vegar

Thanks a lot for your answer. It really helps and I have almost solved it now. However, I have run into a problem.

When I do the implicit set analysis, I tell Qlik to look at all the TasID's that do NOT include 'T20'. Somehow this locks Qlik so i always looks at all these TaskID's. So when I make a bar chart with the set analysis you wrote above, and then filter the app on a specifik TaskID, it does not react. It shows me ALL the TaskID's in the app. Is there someway to tell Qlik to make this set analysis, but still react to filtrations on the field TaskID when using the app?

Best,
Peter

marcus_sommer

Yes, you could connect your set analysis statement with the current selections by using this:

sum({< Field1 *= {'any value'}>} Field2)

- Marcus

Vegar
MVP
MVP

As @marcus_sommer is saying *= should solve your issue.

Like this:

=count({<DateTaskID*= P({$<Type={'T20'}>}),Type = {'T20'} >} DateTaskID)
+ count({<DateTaskID*=E({$<Type={'T20'}>}),Type = {'T1'}>} DateTaskID)

w21450
Contributor III
Contributor III
Author

Thanks a lot!! That helped 🙂