Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I do load two tables from two different excels.
Table1:
UserID,Entry1,Date
Table2:
Name,Entry2,Date
For both tables I do add a new column for month and year, so that it looks like this at the end
Table1:
UserID,Entry1,Date,Month(Date) as M1, Year(Date) as Y1
Table2:
Name,Entry2,Date,Month(Date) as M2, Year(Date) as Y2
Name and UserID are different but i do have a linkingtable(excel) which could look like this (not yet loaded)
TableLink:
UserID,Name
Now i do need to create a third table (during load if possible) which has the following entries
TableEfficiency:
Name,UserID,Month,Year,Efficiency
The entries should be calculated from the previous tables like this:
Name, UserId should be matching according to the linkingTable
Month= The month of the entry from table1 and table2
Year=The yearof the entry from table1 and table2
Efficiency=Sum(All_entry1_in_given_month_and_year) / NumberOfRows(entry2_in_given_month_and_year)
So basically onl 1 entry per user, per month, per year.
Here an example:
Table1: (After it was loaded)
UserID,Entry1,Date,Month(Date) as M1, Year(Date) as Y1
aaa,5,01.01.2019,January,2019
aaa,7,03.01.2019,January,2019
aaa,9,05.01.2019,January,2019
aaa,1,01.02.2019,February,2019
aaa,2,03.02.2019,February,2019
aaa,3,05.02.2019,February,2019
bbb,3,01.01.2019,January,2019
bbb,6,01.01.2019,January,2019
bbb,8,01.01.2019,January,2019
bbb,3,01.03.2019,March,2019
bbb,4,01.03.2019,March,2019
bbb,5,01.03.2019,March,2019
Table2:(After it was loaded)
Name,Entry2,Date,Month(Date) as M2, Year(Date) as Y2
Alex,string1,11.01.2019,January,2019
Alex,string2,13.01.2019,January,2019
Alex,string4,21.02.2019,February,2019
Alex,string5,23.02.2019,February,2019
Alex,string6,25.02.2019,February,2019
Bert,string23,15.01.2019,January,2019
Bert,string42,11.03.2019,March,2019
Bert,string12,16.03.2019,March,2019
Bert,string22,21.03.2019,March,2019
The linking table (or in the loading script):
Alex,aaa
Bert,bbb
The TableEfficiency should now look like this:
Name,UserID,Month,Year,Efficiency
Alex,aaa,January,2019,10.5 // Efficiency is (5+7+9)/2
Alex,aaa,February,2019,2 // Efficiency is (1+2+3)/3
Alex,aaa,March,2019,0 // No entries in at least one table for that month
Bert,bbb,January,2019,17 // Efficiency is (3+6+8)/1
Bert,bbb,February,2019,0 // No entries in at least one table for that month
Bert,bbb,March,2019,4 // Efficiency is (3+4+5)/3
And this should of course also work if entries are in more than just one year.
I think i could solve it on my own by just creating one big table with all entries, but since the original tables are a few thousand entries each i rather would not do that.
Any help is apreciated.
One solution is:
Mapper:
Mapping
LOAD F2, F1 INLINE [
F1, F2
Alex, aaa
Bert, bbb
];
tab1:
LOAD Name, UserID, Sum(Entry1) As Value1, Mon, Yr
Group By Name, UserID, Mon, Yr;
LOAD ApplyMap('Mapper',UserID) As Name,* INLINE [
UserID,Entry1,Date1,Mon,Yr
aaa,5,01.01.2019,January,2019
aaa,7,03.01.2019,January,2019
aaa,9,05.01.2019,January,2019
aaa,1,01.02.2019,February,2019
aaa,2,03.02.2019,February,2019
aaa,3,05.02.2019,February,2019
bbb,3,01.01.2019,January,2019
bbb,6,01.01.2019,January,2019
bbb,8,01.01.2019,January,2019
bbb,3,01.03.2019,March,2019
bbb,4,01.03.2019,March,2019
bbb,5,01.03.2019,March,2019
];
Left Join(tab1)
tab2:
LOAD Name, Count(Entry2) As Value2, Mon, Yr
Group By Name, Mon, Yr;
LOAD * INLINE [
Name,Entry2,Date2,Mon,Yr
Alex,string1,11.01.2019,January,2019
Alex,string2,13.01.2019,January,2019
Alex,string4,21.02.2019,February,2019
Alex,string5,23.02.2019,February,2019
Alex,string6,25.02.2019,February,2019
Bert,string23,15.01.2019,January,2019
Bert,string42,11.03.2019,March,2019
Bert,string12,16.03.2019,March,2019
Bert,string22,21.03.2019,March,2019
];
tab3:
LOAD *, Evaluate(Value1/Value2) As Efficiency
Resident tab1;
Drop Table tab1;
One solution is:
Mapper:
Mapping
LOAD F2, F1 INLINE [
F1, F2
Alex, aaa
Bert, bbb
];
tab1:
LOAD Name, UserID, Sum(Entry1) As Value1, Mon, Yr
Group By Name, UserID, Mon, Yr;
LOAD ApplyMap('Mapper',UserID) As Name,* INLINE [
UserID,Entry1,Date1,Mon,Yr
aaa,5,01.01.2019,January,2019
aaa,7,03.01.2019,January,2019
aaa,9,05.01.2019,January,2019
aaa,1,01.02.2019,February,2019
aaa,2,03.02.2019,February,2019
aaa,3,05.02.2019,February,2019
bbb,3,01.01.2019,January,2019
bbb,6,01.01.2019,January,2019
bbb,8,01.01.2019,January,2019
bbb,3,01.03.2019,March,2019
bbb,4,01.03.2019,March,2019
bbb,5,01.03.2019,March,2019
];
Left Join(tab1)
tab2:
LOAD Name, Count(Entry2) As Value2, Mon, Yr
Group By Name, Mon, Yr;
LOAD * INLINE [
Name,Entry2,Date2,Mon,Yr
Alex,string1,11.01.2019,January,2019
Alex,string2,13.01.2019,January,2019
Alex,string4,21.02.2019,February,2019
Alex,string5,23.02.2019,February,2019
Alex,string6,25.02.2019,February,2019
Bert,string23,15.01.2019,January,2019
Bert,string42,11.03.2019,March,2019
Bert,string12,16.03.2019,March,2019
Bert,string22,21.03.2019,March,2019
];
tab3:
LOAD *, Evaluate(Value1/Value2) As Efficiency
Resident tab1;
Drop Table tab1;