Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi everyone,
I am stuck with one issue which suppose to be easy to accomplish. I know this can be done with expressions but I must do that in a LOAD script.
I have similar kind of data:
Year | Position | Employee ID | Gender |
---|---|---|---|
2014 | Developer | XX1 | Female |
2014 | Accountant | XX2 | Female |
2014 | Developer | XX3 | Male |
2014 | Developer | XX4 | Male |
2015 | Developer | XX5 | Male |
2015 | Accountant | XX6 | Male |
2015 | Developer | XX7 | Male |
2015 | Developer | XX8 | Female |
2016 | Developer | XX9 | Female |
2016 | Accountant | XX10 | Male |
2016 | Accountant | XX11 | Male |
2016 | Developer | XX12 | Female |
2017 | Accountant | XX13 | Male |
2017 | Accountant | XX14 | Female |
2017 | Accountant | XX15 | Female |
2017 | Developer | XX16 | Male |
I am trying to load from this table (to drop it later) and to achieve a new table in a data model that would have dominated flag column (F- if there are more females in the position that year and M if there are more males in the position that year) . The result table suppose to look like this:
Year | Position | Flag |
---|---|---|
2014 | Developer | M |
2014 | Accountant | F |
2015 | Developer | M |
2015 | Accountant | M |
2016 | Developer | F |
2016 | Accountant | M |
2017 | Developer | M |
2017 | Accountant | F |
How to achieve that? I tried to do grouping by and aggregate but somewhat failed...
Your help will be greatly appreciated!
Try this
Table:
LOAD * INLINE [
Year, Position, Employee ID, Gender
2014, Developer, XX1, Female
2014, Accountant, XX2, Female
2014, Developer, XX3, Male
2014, Developer, XX4, Male
2015, Developer, XX5, Male
2015, Accountant, XX6, Male
2015, Developer, XX7, Male
2015, Developer, XX8, Female
2016, Developer, XX9, Female
2016, Accountant, XX10, Male
2016, Accountant, XX11, Male
2016, Developer, XX12, Female
2017, Accountant, XX13, Male
2017, Accountant, XX14, Female
2017, Accountant, XX15, Female
2017, Developer, XX16, Male
];
Left Join (Table)
LOAD Year,
Position,
If(Count(If(Gender = 'Female', [Employee ID])) > Count(If(Gender = 'Male', [Employee ID])), 'F', 'M') as Flag
Resident Table
Group By Year, Position;
Try this
Table:
LOAD * INLINE [
Year, Position, Employee ID, Gender
2014, Developer, XX1, Female
2014, Accountant, XX2, Female
2014, Developer, XX3, Male
2014, Developer, XX4, Male
2015, Developer, XX5, Male
2015, Accountant, XX6, Male
2015, Developer, XX7, Male
2015, Developer, XX8, Female
2016, Developer, XX9, Female
2016, Accountant, XX10, Male
2016, Accountant, XX11, Male
2016, Developer, XX12, Female
2017, Accountant, XX13, Male
2017, Accountant, XX14, Female
2017, Accountant, XX15, Female
2017, Developer, XX16, Male
];
Left Join (Table)
LOAD Year,
Position,
If(Count(If(Gender = 'Female', [Employee ID])) > Count(If(Gender = 'Male', [Employee ID])), 'F', 'M') as Flag
Resident Table
Group By Year, Position;
bro, you are really very quick. Nice solution.
thank you
Thanks for a quick solution Sunny!