Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
EDIT:
Hi People,
I have a change log of incidents with user and update DateTime.
Each incident has multiple records.
For each incident, I want to have only one record, with the first update.
So my final record includes the fields:
Incident No, First Update (DateTime) , User
The solution I found works, but not sure is optimal..
Would appreciate some feedback..
Thanks!
Hi, your current solution looks fine and should work as expected, but there are a couple of things that could be optimized.
Avoid unnecessary Resident Load: You don't need to load the data into TMP2_Incident_Audit twice. Instead, you can calculate FirstUpdatedOn and UserId for each IncidentID in one single load.
Min() and FirstSortedValue() together: There is a possibility of getting incorrect UserId if you have more than one UserId for the same minimum UpdatedOn timestamp for a given IncidentID. To mitigate this, you can concatenate UpdatedOn and UserId into a single field, and then use FirstSortedValue() to get the UserId for the first update.
Here's a possible alternative approach:
TMP2_Incident_Audit:
Load
IncidentID,
Timestamp(Min(UpdatedOn)) As FirstUpdatedOn,
FirstSortedValue(UserId & '|' & UpdatedOn, UpdatedOn) As FirstUserId
Resident TMP1_Incident_Audit
Group By IncidentID;
Final_Incident_Audit:
Load
IncidentID,
FirstUpdatedOn,
SubField(FirstUserId, '|', 1) As UserId
Resident TMP2_Incident_Audit;
Drop Table TMP2_Incident_Audit;
In this approach, the FirstUserId field in the TMP2_Incident_Audit table is a combination of UserId and UpdatedOn. In the Final_Incident_Audit table, we extract the UserId from FirstUserId using the SubField() function.
This solution should give you the desired result and it also eliminates the need for the inner join.
Regarts.
Hi, your current solution looks fine and should work as expected, but there are a couple of things that could be optimized.
Avoid unnecessary Resident Load: You don't need to load the data into TMP2_Incident_Audit twice. Instead, you can calculate FirstUpdatedOn and UserId for each IncidentID in one single load.
Min() and FirstSortedValue() together: There is a possibility of getting incorrect UserId if you have more than one UserId for the same minimum UpdatedOn timestamp for a given IncidentID. To mitigate this, you can concatenate UpdatedOn and UserId into a single field, and then use FirstSortedValue() to get the UserId for the first update.
Here's a possible alternative approach:
TMP2_Incident_Audit:
Load
IncidentID,
Timestamp(Min(UpdatedOn)) As FirstUpdatedOn,
FirstSortedValue(UserId & '|' & UpdatedOn, UpdatedOn) As FirstUserId
Resident TMP1_Incident_Audit
Group By IncidentID;
Final_Incident_Audit:
Load
IncidentID,
FirstUpdatedOn,
SubField(FirstUserId, '|', 1) As UserId
Resident TMP2_Incident_Audit;
Drop Table TMP2_Incident_Audit;
In this approach, the FirstUserId field in the TMP2_Incident_Audit table is a combination of UserId and UpdatedOn. In the Final_Incident_Audit table, we extract the UserId from FirstUserId using the SubField() function.
This solution should give you the desired result and it also eliminates the need for the inner join.
Regarts.
Hi @cristianj23a ,
I like it! thanks a lot!