Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello I am trying to calculate the flight duration for the outbound portion of the trip and the inbound portion of the trip. A trip can have multiple segments on the oubound and inbound portions. I've attached some sample data and in it there is a sequence number for each segment and also a field to show if a segment is a connection on the trip. In the example below I am looking to calculate the leg duration (green column). I know I have to use a combination of the Trip Key, Flight Sequence and Connection and then sum the Segment Duration but not sure how to set that up with set analysis. How would one go about doing this? Is this an aggr situation? Do I need to do any additional data modeling work to facilitate? Any suggestions would be appreciated.
Trip Key | Flight Sequence | Departure Airport | Arrival Airport | Connection | Segment Duration | Leg Duration |
20210622AXU7830283861 | 1 | BHM | ATL | Y | 1.2 | |
20210622AXU7830283861 | 2 | ATL | CRW | 1.3 | 2.5 | |
20210622AXU7830283861 | 3 | CRW | ATL | Y | 1.5 | |
20210622AXU7830283861 | 4 | ATL | BHM | 0.9 | 2.4 |
@carlcimino try this
Data:
LOAD [Trip Key],
[Flight Sequence],
[Departure Airport],
[Arrival Airport],
Connection,
[Segment Duration]
FROM
[Sample Trip Data.xlsx]
(ooxml, embedded labels, table is Data);
T1:
NoConcatenate
LOAD *,
if([Flight Sequence]=1 and len(trim(Connection))=0,null(),
if(lower(trim(Connection))='y',RangeSum(1,Peek('connection_group')),
if(len(trim(Connection))=0,Peek('connection_group'),
))) as connection_group,
if(lower(trim(Connection))='y',1,
RangeSum(1,Peek('connection_sequence'))) as connection_sequence,
if([Trip Key]<>Previous([Trip Key]) or ([Trip Key]=Previous([Trip Key]) and lower(trim(Connection))='y' and previous(lower(trim(Connection)))<>'y'),
[Segment Duration], RangeSum([Segment Duration],Peek('Segment_lag_temp'))) as Segment_lag_temp
Resident Data
Order by [Trip Key], [Flight Sequence];
Left Join(T1)
LOAD [Trip Key],
connection_group,
max(connection_sequence) as Max_connection_sequence
Resident T1
Group by [Trip Key],
connection_group;
DROP Table Data;
Final:
LOAD *,
if(Max_connection_sequence=connection_sequence and lower(trim(Connection))<>'y' ,Segment_lag_temp,
if(len(trim(connection_group))=0,[Segment Duration],0)) as Segment_lag
Resident T1;
DROP Table T1;
DROP Fields Segment_lag_temp,Max_connection_sequence,connection_sequence,connection_group;
Hi,
It seems to me that you need a way of indicating that the leg is outbound or inbound, then you can simply add up the leg duration of each trip key-outbound/inbound combination
@carlcimino you might need to do scripting to do this
Data:
LOAD [Trip Key],
[Flight Sequence],
[Departure Airport],
[Arrival Airport],
Connection,
[Segment Duration]
FROM
[Sample Trip Data.xlsx]
(ooxml, embedded labels, table is Data);
T1:
LOAD *,
if([Trip Key]<>Previous([Trip Key]) or ([Trip Key]=Previous([Trip Key]) and lower(trim(Connection))='y' and previous(lower(trim(Connection)))<>'y'),
[Segment Duration], RangeSum([Segment Duration],Peek('Segment_lag_temp'))) as Segment_lag_temp
Resident Data
Order by [Trip Key], [Flight Sequence];
DROP Table Data;
Final:
LOAD *,
if(lower(trim(Connection))='y',0,Segment_lag_temp) as Segment_lag
Resident T1;
DROP Table T1;
DROP Field Segment_lag_temp;
Hi @Kushal_Chawda This looks to be almost what I was expecting. Where it does not look as expected is when there are no connections at all like the trip below which is just a there and back. In this case the Segment_Lag should be equal to the Segment Duration. But that would have to be an extra peek to see if there are any Connection = Y at all right? Where would that go in the T1 section or the Final? Or would you another step in between? Thanks again I appreciate the help.
Trip Key | Flight Sequence | Departure Airport | Arrival Airport | Connection | Segment Duration |
20210621AXM1100118806391 | 1 | MTY | MEX | 1.6 | |
20210621AXM1100118806391 | 2 | MEX | MTY | 1.8 |
@carlcimino try this
Data:
LOAD [Trip Key],
[Flight Sequence],
[Departure Airport],
[Arrival Airport],
Connection,
[Segment Duration]
FROM
[Sample Trip Data.xlsx]
(ooxml, embedded labels, table is Data);
T1:
NoConcatenate
LOAD *,
if([Flight Sequence]=1 and len(trim(Connection))=0,null(),
if(lower(trim(Connection))='y',RangeSum(1,Peek('connection_group')),
if(len(trim(Connection))=0,Peek('connection_group'),
))) as connection_group,
if(lower(trim(Connection))='y',1,
RangeSum(1,Peek('connection_sequence'))) as connection_sequence,
if([Trip Key]<>Previous([Trip Key]) or ([Trip Key]=Previous([Trip Key]) and lower(trim(Connection))='y' and previous(lower(trim(Connection)))<>'y'),
[Segment Duration], RangeSum([Segment Duration],Peek('Segment_lag_temp'))) as Segment_lag_temp
Resident Data
Order by [Trip Key], [Flight Sequence];
Left Join(T1)
LOAD [Trip Key],
connection_group,
max(connection_sequence) as Max_connection_sequence
Resident T1
Group by [Trip Key],
connection_group;
DROP Table Data;
Final:
LOAD *,
if(Max_connection_sequence=connection_sequence and lower(trim(Connection))<>'y' ,Segment_lag_temp,
if(len(trim(connection_group))=0,[Segment Duration],0)) as Segment_lag
Resident T1;
DROP Table T1;
DROP Fields Segment_lag_temp,Max_connection_sequence,connection_sequence,connection_group;
@Kushal_Chawda rock on. this appears to work as I expected. thank you very much for your help!
Glad that it worked