Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
My file contains lines for people with multiple lines based on status.
For example:
ID | Name | Status | Country |
---|---|---|---|
1002 | Jack | Active | Netherlands |
1002 | Jack | Withdrawn | Belgium |
1002 | Jack | Withdrawn | Germany |
Some people only have 1 active line and no withdrawn lines, and some people will have both.
This file is then joined with multiple other tables.
My goal is:
If someone has a withdrawn status and an active status, only the active line should be imported.
If someone has no active line anymore, the withdrawn line should be imported.
If someone has only an active profile, this should be imported.
Any idea of how I can do this in my load editor? All tables are joined through an outer join.
For common interest I will share the solution I have found the be working for me.
Temp6:
Load *,
if((Previous("Global ID")="Global ID" and Previous("Employee Status ID")="Employee Status ID"),Null(), "Employee Status ID") as ScaleStart
resident EmployeeReport
;
ScaleMap4:
mapping load
AutoNumber("Global ID"), Max(ScaleStart) as MaxScaleStart
resident Temp6
group by "Global ID"
;
drop table Temp6;
FinalEMReport:
NoConcatenate
Load *,
if(ApplyMap('ScaleMap4',AutoNumber("Global ID"),Null())="Employee Status ID","EmpSt",Null()) as "Employment Status"
resident EmployeeReport
Where
ApplyMap('ScaleMap4',AutoNumber("Global ID"),NULL())=NUM("Employee Status ID")
;
drop table EmployeeReport;
You can make use of GROUP BY and the FirstSortedValue function like this example shows:
TESTDATA:
LOAD
ID,
Only(Name) AS Name,
MinString(Status) AS Status,
FirstSortedValue(DISTINCT Country,Ord(Left(Status,1))) AS Country
INLINE [
ID,Name,Status,Country
1002,Jack,Active,Netherlands
1002,Jack,Withdrawn,Belgium
1002,Jack,Withdrawn,Germany
1003,Barney,Withdrawn,Sweden
1003,Barney,Withdrawn,Albania
1004,Charlie,Active,Austria
]
GROUP BY ID;
Since the status is a text I use the first character and get the ascii value in the second parameter of the FirstSortedValue to get a numeric sort-weight as the function will not accept strings for a sort-weight.
Hi Petter,
Thank you very much for your response. The code makes sense to me, however when I apply it to my load script I get the message "invalid expression"
My load script consists of about 10 different source files all outer joined in the final table:
Load
"Global ID",
"Personal ID",
If(Len(Trim([Name])) > 0, [Name], [Name Movement]) as Name,
Gender,
Country,
"Employment Status",
Resident Employee_Report;
Drop table Employee_Report;
There are a lot more fields, but they are not applicable to this issue.
Essentially what I did what apply the same logic as you proposed, however I got the message saying "Invalid Expression"what is going wrong.
My source files exist of these and more fields, containing several people with their lines like in the example given above.
Any chance you might now what to do?
Thanks a lot for your time.
For common interest I will share the solution I have found the be working for me.
Temp6:
Load *,
if((Previous("Global ID")="Global ID" and Previous("Employee Status ID")="Employee Status ID"),Null(), "Employee Status ID") as ScaleStart
resident EmployeeReport
;
ScaleMap4:
mapping load
AutoNumber("Global ID"), Max(ScaleStart) as MaxScaleStart
resident Temp6
group by "Global ID"
;
drop table Temp6;
FinalEMReport:
NoConcatenate
Load *,
if(ApplyMap('ScaleMap4',AutoNumber("Global ID"),Null())="Employee Status ID","EmpSt",Null()) as "Employment Status"
resident EmployeeReport
Where
ApplyMap('ScaleMap4',AutoNumber("Global ID"),NULL())=NUM("Employee Status ID")
;
drop table EmployeeReport;