Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
MikeJohn
Contributor II
Contributor II

Create table out of different tables

Hi,

 

I do load two tables from two different excels.

 

Table1:
UserID,Entry1,Date

Table2:
Name,Entry2,Date

 

For both tables I do add a new column for month and year, so that it looks like this at the end

 

Table1:
UserID,Entry1,Date,Month(Date) as M1, Year(Date) as Y1

Table2:
Name,Entry2,Date,Month(Date) as M2, Year(Date) as Y2

 

Name and UserID are different but i do have a linkingtable(excel) which could look like this (not yet loaded)

TableLink:

UserID,Name

 

Now i do need to create a third table (during load if possible) which has the following entries

TableEfficiency:

Name,UserID,Month,Year,Efficiency

 

The entries should be calculated from the previous tables like this:

Name, UserId should be matching according to the linkingTable

Month= The month of the entry from table1 and table2

Year=The yearof the entry from table1 and table2

Efficiency=Sum(All_entry1_in_given_month_and_year) / NumberOfRows(entry2_in_given_month_and_year)

 

So basically onl 1 entry per user, per month, per year.

 

Here an example:

 

Table1: (After it was loaded)

UserID,Entry1,Date,Month(Date) as M1, Year(Date) as Y1

aaa,5,01.01.2019,January,2019
aaa,7,03.01.2019,January,2019
aaa,9,05.01.2019,January,2019
aaa,1,01.02.2019,February,2019
aaa,2,03.02.2019,February,2019
aaa,3,05.02.2019,February,2019

bbb,3,01.01.2019,January,2019
bbb,6,01.01.2019,January,2019
bbb,8,01.01.2019,January,2019
bbb,3,01.03.2019,March,2019
bbb,4,01.03.2019,March,2019
bbb,5,01.03.2019,March,2019

 

Table2:(After it was loaded)

Name,Entry2,Date,Month(Date) as M2, Year(Date) as Y2

Alex,string1,11.01.2019,January,2019
Alex,string2,13.01.2019,January,2019
Alex,string4,21.02.2019,February,2019
Alex,string5,23.02.2019,February,2019
Alex,string6,25.02.2019,February,2019

Bert,string23,15.01.2019,January,2019
Bert,string42,11.03.2019,March,2019
Bert,string12,16.03.2019,March,2019
Bert,string22,21.03.2019,March,2019

 

The linking table (or in the loading script):

Alex,aaa

Bert,bbb

 

The TableEfficiency should now look like this:

Name,UserID,Month,Year,Efficiency

Alex,aaa,January,2019,10.5 // Efficiency is (5+7+9)/2
Alex,aaa,February,2019,2 // Efficiency is (1+2+3)/3
Alex,aaa,March,2019,0 // No entries in at least one table for that month

Bert,bbb,January,2019,17 // Efficiency is (3+6+8)/1
Bert,bbb,February,2019,0 // No entries in at least one table for that month
Bert,bbb,March,2019,4 // Efficiency is (3+4+5)/3

 

And this should of course also work if entries are in more than just one year.

I think i could solve it on my own by just creating one big table with all entries, but since the original tables are a few thousand entries each i rather would not do that.

 

Any help is apreciated.

Labels (2)
1 Solution

Accepted Solutions
Saravanan_Desingh

One solution is:

Mapper:
Mapping
LOAD F2, F1 INLINE [
    F1, F2
    Alex, aaa
    Bert, bbb
];

tab1:
LOAD Name, UserID, Sum(Entry1) As Value1, Mon, Yr
Group By Name, UserID, Mon, Yr;
LOAD ApplyMap('Mapper',UserID) As Name,* INLINE [
    UserID,Entry1,Date1,Mon,Yr
    aaa,5,01.01.2019,January,2019
    aaa,7,03.01.2019,January,2019
    aaa,9,05.01.2019,January,2019
    aaa,1,01.02.2019,February,2019
    aaa,2,03.02.2019,February,2019
    aaa,3,05.02.2019,February,2019
    bbb,3,01.01.2019,January,2019
    bbb,6,01.01.2019,January,2019
    bbb,8,01.01.2019,January,2019
    bbb,3,01.03.2019,March,2019
    bbb,4,01.03.2019,March,2019
    bbb,5,01.03.2019,March,2019
];

Left Join(tab1)
tab2:
LOAD Name, Count(Entry2) As Value2, Mon, Yr
Group By Name, Mon, Yr;
LOAD * INLINE [
    Name,Entry2,Date2,Mon,Yr
    Alex,string1,11.01.2019,January,2019
    Alex,string2,13.01.2019,January,2019
    Alex,string4,21.02.2019,February,2019
    Alex,string5,23.02.2019,February,2019
    Alex,string6,25.02.2019,February,2019
    Bert,string23,15.01.2019,January,2019
    Bert,string42,11.03.2019,March,2019
    Bert,string12,16.03.2019,March,2019
    Bert,string22,21.03.2019,March,2019
];

tab3:
LOAD *, Evaluate(Value1/Value2) As Efficiency
Resident tab1;

Drop Table tab1;

commQV14.PNG

View solution in original post

1 Reply
Saravanan_Desingh

One solution is:

Mapper:
Mapping
LOAD F2, F1 INLINE [
    F1, F2
    Alex, aaa
    Bert, bbb
];

tab1:
LOAD Name, UserID, Sum(Entry1) As Value1, Mon, Yr
Group By Name, UserID, Mon, Yr;
LOAD ApplyMap('Mapper',UserID) As Name,* INLINE [
    UserID,Entry1,Date1,Mon,Yr
    aaa,5,01.01.2019,January,2019
    aaa,7,03.01.2019,January,2019
    aaa,9,05.01.2019,January,2019
    aaa,1,01.02.2019,February,2019
    aaa,2,03.02.2019,February,2019
    aaa,3,05.02.2019,February,2019
    bbb,3,01.01.2019,January,2019
    bbb,6,01.01.2019,January,2019
    bbb,8,01.01.2019,January,2019
    bbb,3,01.03.2019,March,2019
    bbb,4,01.03.2019,March,2019
    bbb,5,01.03.2019,March,2019
];

Left Join(tab1)
tab2:
LOAD Name, Count(Entry2) As Value2, Mon, Yr
Group By Name, Mon, Yr;
LOAD * INLINE [
    Name,Entry2,Date2,Mon,Yr
    Alex,string1,11.01.2019,January,2019
    Alex,string2,13.01.2019,January,2019
    Alex,string4,21.02.2019,February,2019
    Alex,string5,23.02.2019,February,2019
    Alex,string6,25.02.2019,February,2019
    Bert,string23,15.01.2019,January,2019
    Bert,string42,11.03.2019,March,2019
    Bert,string12,16.03.2019,March,2019
    Bert,string22,21.03.2019,March,2019
];

tab3:
LOAD *, Evaluate(Value1/Value2) As Efficiency
Resident tab1;

Drop Table tab1;

commQV14.PNG