Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello all,
I'm trying to join two tables but I'm having difficult because one of the tables have about 20 million lines (passengers table) and I'm creating more lines with my join. I have two tables: Passengers and Travels. I want to get the count of passengers in each travel and keep the travels table and drop the passengers table.
Travels:
Load * inline
[
Car,StartTime,EndTime, Line
1,01/01/2020 08:00:00, 01/01/2020 08:40:00, 100
1,01/01/2020 08:40:00, 01/01/2020 09:20:00, 100
1,01/01/2020 09:20:00, 01/01/2020 10:00:00, 100
2,01/01/2020 08:00:00, 01/01/2020 08:40:00, 101
2,01/01/2020 08:40:00, 01/01/2020 09:20:00, 101
2,01/01/2020 09:20:00, 01/01/2020 10:00:00, 101
];
Passengers:
Load * inline
[
Car,Time
1,01/01/2020 08:10:00
1,01/01/2020 08:50:00
1,01/01/2020 09:10:00
2,01/01/2020 08:10:00
2,01/01/2020 08:30:00
2,01/01/2020 09:30:00
];
The final table should look like:
Car | StartTime | EndTime | Line | Nr. Passengers |
1 | 01/01/2020 08:00 | 01/01/2020 08:40 | 100 | 1 |
1 | 01/01/2020 08:40 | 01/01/2020 09:20 | 100 | 2 |
1 | 01/01/2020 09:20 | 01/01/2020 10:00 | 100 | 0 |
2 | 01/01/2020 08:00 | 01/01/2020 08:40 | 101 | 2 |
2 | 01/01/2020 08:40 | 01/01/2020 09:20 | 101 | 0 |
2 | 01/01/2020 09:20 | 01/01/2020 10:00 | 101 | 1 |
Thanks in advance
I was able to get this from the script in the bottom.
Travels:
Load * inline
[
Car,StartTime,EndTime, Line
1,01/01/2020 08:00:00, 01/01/2020 08:40:00, 100
1,01/01/2020 08:40:00, 01/01/2020 09:20:00, 100
1,01/01/2020 09:20:00, 01/01/2020 10:00:00, 100
2,01/01/2020 08:00:00, 01/01/2020 08:40:00, 101
2,01/01/2020 08:40:00, 01/01/2020 09:20:00, 101
2,01/01/2020 09:20:00, 01/01/2020 10:00:00, 101
];
Passengers:
Load * inline
[
Car,Time
1,01/01/2020 08:10:00
1,01/01/2020 08:50:00
1,01/01/2020 09:10:00
2,01/01/2020 08:10:00
2,01/01/2020 08:30:00
2,01/01/2020 09:30:00
];
Left join
IntervalMatch (Time,Car)
LOAD StartTime, EndTime, Car
Resident Travels;
LEFT Join (Travels)
LOAD count(Car) as NoOfTravelers, StartTime,EndTime,Car
Resident Passengers
Group by StartTime,EndTime,Car
;
Drop table Passengers;
I was able to get this from the script in the bottom.
Travels:
Load * inline
[
Car,StartTime,EndTime, Line
1,01/01/2020 08:00:00, 01/01/2020 08:40:00, 100
1,01/01/2020 08:40:00, 01/01/2020 09:20:00, 100
1,01/01/2020 09:20:00, 01/01/2020 10:00:00, 100
2,01/01/2020 08:00:00, 01/01/2020 08:40:00, 101
2,01/01/2020 08:40:00, 01/01/2020 09:20:00, 101
2,01/01/2020 09:20:00, 01/01/2020 10:00:00, 101
];
Passengers:
Load * inline
[
Car,Time
1,01/01/2020 08:10:00
1,01/01/2020 08:50:00
1,01/01/2020 09:10:00
2,01/01/2020 08:10:00
2,01/01/2020 08:30:00
2,01/01/2020 09:30:00
];
Left join
IntervalMatch (Time,Car)
LOAD StartTime, EndTime, Car
Resident Travels;
LEFT Join (Travels)
LOAD count(Car) as NoOfTravelers, StartTime,EndTime,Car
Resident Passengers
Group by StartTime,EndTime,Car
;
Drop table Passengers;
Vegar, it worked great! Thank you very much!