Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
andymanu
Creator II
Creator II

Calculate distinct values from rows based on multiple dimension fileds

Hi All,

This question I am posting as an extension to one of my previous question posted on the below link,

https://community.qlik.com/t5/New-to-QlikView/Calculate-distinct-number-of-names-from-a-row-based-on...

Assume my data set looks something like below, (Please find the ached excel sheet herewith)

Trainee NoTrainee NameStaff 1Staff 2Staff 3RoomSession_StartDate
100MarkMatt  ScodRoom-012/10/2018 14:302/10/2018
101AntonSylan  Room-023/10/2018 10:303/10/2018
102AndrewSylanMatt (Dept A) Room-012/10/2018 14:302/10/2018
103AshScod  Room-035/10/2018 10:305/10/2018
104PrishScodBrenden (Dept B) Room-0112/10/2018 14:3012/10/2018
105DavidBrendenMatt (Dept A)Sylan (Dept C)Room-012/10/2018 10:302/10/2018
101AntonSylan  Room-035/10/2018 10:305/10/2018
103AshScodBrenden (Dept B) Room-012/10/2018 14:302/10/2018
105DavidScod  Room-0116/10/2018 14:3016/10/2018
101AntonPaul  Room-012/10/2018 10:302/10/2018
100MarkScodBrenden (Dept B) Room-035/10/2018 10:305/10/2018
101AntonSylan  Room-0116/10/2018 14:3016/10/2018
105DavidScodSylan Room-0110/10/2018 14:3010/10/2018

 

My main intention is to calculate the number of staff allocated to each sessions (based on the unique "Room" and the "Session_Start" combination).

As depicted above,  each session will have number of Trainee Staff and assigned Staff members for the training programs.

Two additional requirements I want to achieve at this stage are;

1. When you group the data based on "Room" and "Session_Start", you wil have a session record, "Room-01" and "2/10/2018 14:30"  and the respective Staff aloocation data should be, Matt, Sylan, Scod and Brenden ( a total of 4).

The Staff member "Scod" can be in either the field (Staff 1 and Staff 3 according to the example) and the count should not consinder it as two records, since only distinct 4 staff members have assigned to the above said session.

2. Further, certain names can have their respective beloinging department followed by the name within a brackets. Thus, I am also looking for a mechanism to identify this as a single instance meaning, even if you get the name or the name followed by the department name should consider as the same staff member. Thus again, the count should be as 4 for the allocated staff for the "Room-01" and ' "2/10/2018 14:30" session.

Thanks a lot in advance if someone could help me in this regards.

Kind regards,

Andy

 

 

1 Solution

Accepted Solutions
jonathandienst
Partner - Champion III
Partner - Champion III

Something like this:

CrossTable(Order, StaffName, 2)
LOAD
    [Room],
    [Session_Start],
    [Staff 1],
    SubField([Staff 2], ' ', 1) as [Staff 2],
    SubField([Staff 3], ' ', 1) as [Staff 3]
FROM [Staff Allocation.xlsx]
(ooxml, embedded labels, table is [Group Activity]);

LOAD
    [Trainee No],
    [Trainee Name],
    [Room],
    [Session_Start],
    [Date]
FROM [Staff Allocation.xlsx]
(ooxml, embedded labels, table is [Group Activity]);

Drop Field Order;  // this field not required

This will create a composite (synthetic) key. In this case, the synthetic key is a legitimate  composite key and can be left in place. If you want to create your own comoposite to replace the synthetic key, then go ahead - but it is not really necessary.

Now create a table with Room and Session Start as dimensions and Count(Distinct StaffName) as the expression/measure.

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein

View solution in original post

4 Replies
tresesco
MVP
MVP

Hi Andy,
With more such new requirements things would start becoming hard to harder for you with such table format. As I suggested earlier (in the other thread), consider transforming the table structure using Crosstable(), that would make your life easier here.
jonathandienst
Partner - Champion III
Partner - Champion III

Something like this:

CrossTable(Order, StaffName, 2)
LOAD
    [Room],
    [Session_Start],
    [Staff 1],
    SubField([Staff 2], ' ', 1) as [Staff 2],
    SubField([Staff 3], ' ', 1) as [Staff 3]
FROM [Staff Allocation.xlsx]
(ooxml, embedded labels, table is [Group Activity]);

LOAD
    [Trainee No],
    [Trainee Name],
    [Room],
    [Session_Start],
    [Date]
FROM [Staff Allocation.xlsx]
(ooxml, embedded labels, table is [Group Activity]);

Drop Field Order;  // this field not required

This will create a composite (synthetic) key. In this case, the synthetic key is a legitimate  composite key and can be left in place. If you want to create your own comoposite to replace the synthetic key, then go ahead - but it is not really necessary.

Now create a table with Room and Session Start as dimensions and Count(Distinct StaffName) as the expression/measure.

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein
andymanu
Creator II
Creator II
Author

Hi Jonty/Tresesco,
Thanks a lot for your feedback.
I was trying to avoid the CrossTable() function but seems it is producing some excellent options to cater the new requirements.
For the given data set it works fine. However, actually, the real data set I am using got about 50000 data records for the last four months.
Additionally, I am performing some other manipulations to the data to fulfill my final requirements.
Nevertheless, I'll apply the CrossTable() function on the real data set and let you know the outcome.
Thanks for both of your's vital feedback.
Kind regards,
Andy

andymanu
Creator II
Creator II
Author

Hi,
Could you please let me know the purpose of using the "Order" field?
Thanks.
Andrew