Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I have a user table that is being loaded from a QVD.
I have been trying to load only the most up to date user details per username without success.
The fields I have are
Active (0/1)
EmployeeNum (Integer)
Username (String AD Username)
I need to return
Usernames and EmployeeNum with the max(Active)
This should give me all active users and any inactive users who have a username that an active employee isn't currently using.
Example table.
Active | EmployeeNum | Username |
0 | 20 | AndrewWilliams |
0 | 21 | FredJones |
0 | 22 | PeterSmith |
1 | 23 | SamDavis |
1 | 24 | FredJones |
1 | 25 | PeterSmith |
I don't want to return EmployeeNum 21 and 22 from the above example as there usernames have been reused for employee 24 and 25.
What I want to return is as follows
Active | EmployeeNum | Username |
0 | 20 | AndrewWilliams |
1 | 23 | SamDavis |
1 | 24 | FredJones |
1 | 25 | PeterSmith |
Anyone done this sort of thing before?.
I tried using Max() in the load with Group By Active, but this didn't work.
I don't want to go back to the SQL database to pull this info as its in the US and we are in the UK and very slow to pull from, so I want to pull from our nightly QVD extract.
Many thanks in advance for any help.
Hi Peter
This is a classic 'let us know what the most recent details against each of our employees is' scenario which is best achieved in script.
In your quest you actually have anomalous issues. For example, you want to know the most recent status against each employee, but each employee may no longer be active. Consequently you can't simply query your data for the current 'Active' employees, i.e select all where the active status is 1. Instead you are looking for the most recent entry against each employee name to determine the current status of each name. In your database ideally you would have an audit table that would show the dates on which each employee transitioned from active to inactive. However, you do not show that data so I will assume it does not exist. However, from your data it appears that each new employee gets a new employee number even if they have the same name as a previous employee so we can use that to achieve your desired result.
Firstly, let us consider what consistent information we have. In your data the consistent data is the Employee Name and the Active status. This is a risky strategy as it is prone to issues from spelling mistakes/entry variations in the name but let's run with it.
Firstly, I loaded a copy of your raw data:
Data:
Load * inline [
Active, EmployeeNum, Username
0, 20, AndrewWilliams
0, 21, FredJones
0, 22, PeterSmith
1, 23, SamDavis
1, 24, FredJones
1, 25, PeterSmith
];
I used an inline load for simplicity but I assume this would come from your database. No matter where it comes from the key point is that you have the data in the three fields listed in a table that resides in memory in QlikView.
As the employee number is the key identifier in your data we next need to identify the most recent identifier for each name that you have in your data. This is achieved by:
left join (Data)
Load
Username,
Firstsortedvalue(EmployeeNum,-EmployeeNum) as LatestEmployeeNumber
Resident Data
Group By Username
;
The above script groups your data by name and identifies the highest employee number that is held against each name. It then joins that value to the initially loaded table so now you have a new field in your table called LatestEmployeeNumber.
The next step is to load data into a table where the EmployeeNum is equal to the LatestEmployeeNumber which is achieved by:
LatestEmployeeStatus:
Load
EmployeeNum as currentEmployeeNumber,
Username as currentUserName,
Active as currentStatus
Resident Data
Where EmployeeNum=LatestEmployeeNumber;
This will create the table you were aiming for:
You could then drop the original table if you no longer needed it using: Drop table Data;
So the complete script is:
Data:
Load * inline [
Active, EmployeeNum, Username
0, 20, AndrewWilliams
0, 21, FredJones
0, 22, PeterSmith
1, 23, SamDavis
1, 24, FredJones
1, 25, PeterSmith
];
left join (Data)
Load
Username,
Firstsortedvalue(EmployeeNum,-EmployeeNum) as LatestEmployeeNumber
Resident Data
Group By Username
;
LatestEmployeeStatusResult:
Load
EmployeeNum as currentEmployeeNumber,
Username as currentUserName,
Active as currentStatus
Resident Data
Where EmployeeNum=LatestEmployeeNumber;
Hope that helps.
Kind regards
Steve
Try something like
INPUT:
LOAD Active,
EmployeeNum,
Username
FROM
[https://community.qlik.com/thread/171191]
(html, codepage is 1252, embedded labels, table is @1);
RESULT:
LOAD Username as UsernameCheck,
*
RESIDENT INPUT
WHERE NOT EXISTS(UsernameCheck, Username)
ORDER BY Username, Active desc;
DROP TABLE INPUT;
DROP FIELD UsernameCheck;
Create a Straight Table
Dimensions
=Aggr(max(EmployeeNum),Username)
=Aggr(Max(Active),Username)
Expression
FirstSortedValue(Username,-Active)
Try with this code
TAB1:
LOAD * Inline [
Active, EmployeeNum, Username
0, 20, AndrewWilliams
0, 21, FredJones
0, 22, PeterSmith
1, 23, SamDavis
1, 24, FredJones
1, 25, PeterSmith
];
Left Join
LOAD '1' as flag, max(Active) as Active, Max(EmployeeNum) as EmployeeNum Resident TAB1 Group By Username;
Final: NoConcatenate
LOAD * Resident TAB1 Where flag=1;
DROP Table TAB1;
or
Create a Straight Table
Dimensions
=Aggr(max(EmployeeNum),Username)
=Aggr(Max(Active),Username)
Expression
Username
Hi Peter
This is a classic 'let us know what the most recent details against each of our employees is' scenario which is best achieved in script.
In your quest you actually have anomalous issues. For example, you want to know the most recent status against each employee, but each employee may no longer be active. Consequently you can't simply query your data for the current 'Active' employees, i.e select all where the active status is 1. Instead you are looking for the most recent entry against each employee name to determine the current status of each name. In your database ideally you would have an audit table that would show the dates on which each employee transitioned from active to inactive. However, you do not show that data so I will assume it does not exist. However, from your data it appears that each new employee gets a new employee number even if they have the same name as a previous employee so we can use that to achieve your desired result.
Firstly, let us consider what consistent information we have. In your data the consistent data is the Employee Name and the Active status. This is a risky strategy as it is prone to issues from spelling mistakes/entry variations in the name but let's run with it.
Firstly, I loaded a copy of your raw data:
Data:
Load * inline [
Active, EmployeeNum, Username
0, 20, AndrewWilliams
0, 21, FredJones
0, 22, PeterSmith
1, 23, SamDavis
1, 24, FredJones
1, 25, PeterSmith
];
I used an inline load for simplicity but I assume this would come from your database. No matter where it comes from the key point is that you have the data in the three fields listed in a table that resides in memory in QlikView.
As the employee number is the key identifier in your data we next need to identify the most recent identifier for each name that you have in your data. This is achieved by:
left join (Data)
Load
Username,
Firstsortedvalue(EmployeeNum,-EmployeeNum) as LatestEmployeeNumber
Resident Data
Group By Username
;
The above script groups your data by name and identifies the highest employee number that is held against each name. It then joins that value to the initially loaded table so now you have a new field in your table called LatestEmployeeNumber.
The next step is to load data into a table where the EmployeeNum is equal to the LatestEmployeeNumber which is achieved by:
LatestEmployeeStatus:
Load
EmployeeNum as currentEmployeeNumber,
Username as currentUserName,
Active as currentStatus
Resident Data
Where EmployeeNum=LatestEmployeeNumber;
This will create the table you were aiming for:
You could then drop the original table if you no longer needed it using: Drop table Data;
So the complete script is:
Data:
Load * inline [
Active, EmployeeNum, Username
0, 20, AndrewWilliams
0, 21, FredJones
0, 22, PeterSmith
1, 23, SamDavis
1, 24, FredJones
1, 25, PeterSmith
];
left join (Data)
Load
Username,
Firstsortedvalue(EmployeeNum,-EmployeeNum) as LatestEmployeeNumber
Resident Data
Group By Username
;
LatestEmployeeStatusResult:
Load
EmployeeNum as currentEmployeeNumber,
Username as currentUserName,
Active as currentStatus
Resident Data
Where EmployeeNum=LatestEmployeeNumber;
Hope that helps.
Kind regards
Steve
Thanks Steven,
I have been having problems with this for so long.
This worked a charm and also allowed me to identify the duplicates and create another table for the duplicates for IT to remove going forward.
Its like 2 for 1
Happy days.
Hi Peter
You're welcome. Glad it helped to resolve your issue.
Kind regards
Steve