Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I am trying to load in historical employee location data (which location they worked at and for how long) but I only want to pull in the most recent record for everyone. Our system flags records as the 'current record' so it's easy to find those, but I want to find records that are no longer current and have a later start date than any other record for that unique ID. When a person resigns, their location data is retained but their most recent record is no longer flagged as current - we want to only look at current records or the last current record for each PersonID.
RowNum | PersonID | StartDate | EndDate | CurrentFlag | LocationID |
---|---|---|---|---|---|
1 | 1 | 1/1/17 | null | 1 | Texas |
2 | 1 | 4/1/16 | 12/31/16 | 0 | Texas |
3 | 2 | 2/1/17 | 4/1/17 | 0 | Michigan |
4 | 2 | 1/1/17 | 1/31/17 | 0 | Missouri |
5 | 3 | 7/1/17 | null | 0 | Florida |
6 | 3 | 1/1/17 | 6/30/17 | 1 | Missouri |
Based on this data, we would want to pull rows 1, 3, and 6.
Am I better off trying to flag the data in the load using an if statement? I wasn't sure if this was a scenario to use "Group By PersonID".
Thank you,
Phil
May be try this
Table:
LOAD * INLINE [
RowNum, PersonID, StartDate, EndDate, CurrentFlag, LocationID
1, 1, 1/1/17, null, 1, Texas
2, 1, 4/1/16, 12/31/16, 0, Texas
3, 2, 2/1/17, 4/1/17, 0, Michigan
4, 2, 1/1/17, 1/31/17, 0, Missouri
5, 3, 7/1/17, null, 0, Florida
6, 3, 1/1/17, 6/30/17, 1, Missouri
];
FinalTable:
NoConcatenate
LOAD *
Resident Table
Where PersonID <> Previous(PersonID)
Order By PersonID, CurrentFlag desc, EndDate desc;
DROP Table Table;
So are we saying that if CurrentFlag = 1 for a particular PersonID, then pull the row with CurrentFlag = 1, if it isn't then pull the row where StartDate and EndDate are max?
That is correct, we want to pull one record for each PersonID given that criteria.
Thank you,
Phil
May be try this
Table:
LOAD * INLINE [
RowNum, PersonID, StartDate, EndDate, CurrentFlag, LocationID
1, 1, 1/1/17, null, 1, Texas
2, 1, 4/1/16, 12/31/16, 0, Texas
3, 2, 2/1/17, 4/1/17, 0, Michigan
4, 2, 1/1/17, 1/31/17, 0, Missouri
5, 3, 7/1/17, null, 0, Florida
6, 3, 1/1/17, 6/30/17, 1, Missouri
];
FinalTable:
NoConcatenate
LOAD *
Resident Table
Where PersonID <> Previous(PersonID)
Order By PersonID, CurrentFlag desc, EndDate desc;
DROP Table Table;
That seems to have done the trick thank you! I was trying to use max(StartDate) and FirstSortedValue and such with no luck. I'll have to look into Previous() and when to use it in loads.
Thanks Sunny!