Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi everyone
I have an issue with set analysis that I hope someone in here can help with:)
I will try to simplify my question as best I can. I have attached a screenshot of some sample data, so it makes more sense when reading this.
I have a datamodel where I have a lot of task with different IDs (S1, S2, S3, S4 etc.). These tasks can have two different subtypes on any given day. So on 01/10/2021 Task S1 can have two different subtypes (T1 and T20). This would result in two different rows with the same date and with the same task ID (S1) but different subtypes (T1 and T20). So the subtypes can overlap on the same day.
If you look at the attached file, I want to Count Distinct on column A which is called "Task ID". When there are no overlapping subtypes this is easy. In a set expression I just distinct count all rows with type = T1. However, because of our specific datamodel and its foreign keys it is very important that when the subtypes overlap on the same date, the Task ID should only be counted if it has type = T20 and NOT if it has type T1.
So while there are no overlapping types on the same date the all TaskID's with T1 should be counted but when they overlap only Task ID rows with T20 should be counted.
In the attached picture I have drawn a box around the area where they overlap. And in column D I have marked if the row should be counted (Column D is not part of the data but just a column I put in, so it would be easier to explain to you).
Is there any way to do this in a set expression? Or any other way?
Thanks in advance.
Best,
Peter
An alternative to Rubenbarin (if you have or are able to create) a field that is unique per TaskID-date combination is to use inplicit set analysis.
Consider this script:
LOAD
AutoNumberHash128(Date,[Task ID] ) as [DateTaskID] , //Unique per ID and date
*
inline
[
Date, Task ID, Type, Sould be counted
2021-10-07, S1,T1, x
2021-10-07, S2,T1,
2021-10-07, S2,T20, x
2021-10-08, S1,T1,
2021-10-08, S1,T20,x
2021-10-09, S1,T1,
2021-10-09, S1,T20,x
2021-10-10, S1,T1,
2021-10-10, S1,T20,x
2021-10-15, S1,T1, x
2021-10-16, S1,T1,x
];
Then you can calculate what you are asking for by this expression:
=count({<DateTaskID=P({$<Type={'T20'}>}),Type = {'T20'} >} DateTaskID)
+ count({<DateTaskID=E({$<Type={'T20'}>}),Type = {'T1'}>} DateTaskID)
Hi, you can try flagging records while loading, doing a sorted load and using peek to flag records, with this example can be something like:
LOAD
Date,
[Task ID],
Type,
If(Peek(Date)=Date and Peek([Task ID])=[Task ID] and Type='T1' and Peek(Type)='T20', 0, 1) as flagCounted
Resident/from...
Order By [Task ID], Date, Type desc;
So the rows with T1 where there is a T20 for the same task and date are flagged as 0, so you can use a set analysis with flagCounted={1} to avoid them.
An alternative to Rubenbarin (if you have or are able to create) a field that is unique per TaskID-date combination is to use inplicit set analysis.
Consider this script:
LOAD
AutoNumberHash128(Date,[Task ID] ) as [DateTaskID] , //Unique per ID and date
*
inline
[
Date, Task ID, Type, Sould be counted
2021-10-07, S1,T1, x
2021-10-07, S2,T1,
2021-10-07, S2,T20, x
2021-10-08, S1,T1,
2021-10-08, S1,T20,x
2021-10-09, S1,T1,
2021-10-09, S1,T20,x
2021-10-10, S1,T1,
2021-10-10, S1,T20,x
2021-10-15, S1,T1, x
2021-10-16, S1,T1,x
];
Then you can calculate what you are asking for by this expression:
=count({<DateTaskID=P({$<Type={'T20'}>}),Type = {'T20'} >} DateTaskID)
+ count({<DateTaskID=E({$<Type={'T20'}>}),Type = {'T1'}>} DateTaskID)
Hi Vegar
Thanks a lot for your answer. It really helps and I have almost solved it now. However, I have run into a problem.
When I do the implicit set analysis, I tell Qlik to look at all the TasID's that do NOT include 'T20'. Somehow this locks Qlik so i always looks at all these TaskID's. So when I make a bar chart with the set analysis you wrote above, and then filter the app on a specifik TaskID, it does not react. It shows me ALL the TaskID's in the app. Is there someway to tell Qlik to make this set analysis, but still react to filtrations on the field TaskID when using the app?
Best,
Peter
Yes, you could connect your set analysis statement with the current selections by using this:
sum({< Field1 *= {'any value'}>} Field2)
- Marcus
As @marcus_sommer is saying *= should solve your issue.
Like this:
=count({<DateTaskID*= P({$<Type={'T20'}>}),Type = {'T20'} >} DateTaskID)
+ count({<DateTaskID*=E({$<Type={'T20'}>}),Type = {'T1'}>} DateTaskID)
Thanks a lot!! That helped 🙂