Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have a table that contains a list of the transfers made for an order. Each order can have multiple transfers.
Incorrect transfers may occur, and they are marked with a flag.
the transfer following an incorrect transfer is always a correction of the transfer, but there can be multiple incorrect transfer in a row.
Here is an example dataset:
OrderID | Trans_Num | Start_Date-Time | End_Date-Time | KM | Flag |
1001 | 1 | 01/01/2024 08:00 | 01/01/2024 09:00 | 89 | 0 |
1001 | 2 | 01/01/2024 10:00 | 01/01/2024 11:30 | 75 | 1 |
1001 | 3 | 01/01/2024 11:40 | 01/01/2024 13:15 | 80 | 0 |
1001 | 4 | 01/02/2024 08:00 | 01/03/2024 10:07 | 300 | 0 |
1002 | 1 | 01/01/2024 09:15 | 01/01/2024 10:00 | 54 | 1 |
1002 | 2 | 01/02/2024 08:00 | 01/02/2024 10:09 | 134 | 0 |
1002 | 3 | 01/02/2024 08:16 | 01/02/2024 09:13 | 79 | 0 |
1002 | 4 | 01/02/2024 08:34 | 01/02/2024 09:02 | 82 | 1 |
1002 | 5 | 01/03/2024 07:45 | 01/03/2024 08:23 | 93 | 1 |
1002 | 6 | 01/03/2024 09:10 | 01/03/2024 10:00 | 95 | 0 |
1002 | 7 | 01/03/2024 09:15 | 01/03/2024 11:35 | 204 | 0 |
In the example there are 3 cases, each one marked in a different color.
For each case I need to combine into a single row:
This is the result that should be achieved:
OrderID | Trans_Num | Start_Date-Time | End_Date-Time | TotalKM | IncorrectTransNum |
1001 | 1 | 01/01/2024 08:00 | 01/01/2024 09:00 | 89 | 0 |
1001 | 3 | 01/01/2024 10:00 | 01/01/2024 13:15 | 155 | 1 |
1001 | 4 | 01/02/2024 08:00 | 01/03/2024 10:07 | 300 | 0 |
1002 | 2 | 01/01/2024 09:15 | 01/02/2024 10:09 | 188 | 1 |
1002 | 3 | 01/02/2024 08:16 | 01/02/2024 09:13 | 79 | 0 |
1002 | 6 | 01/02/2024 08:34 | 01/03/2024 10:00 | 270 | 2 |
1002 | 7 | 01/03/2024 09:15 | 01/03/2024 11:35 | 204 | 0 |
I don't even know how to approach this so I need your help.
Thanks,
Amit.
@Amit_B Try below
Data:
Load * Inline [
OrderID Trans_Num Start_Date-Time End_Date-Time KM Flag
1001 1 01/01/2024 08:00 01/01/2024 09:00 89 0
1001 2 01/01/2024 10:00 01/01/2024 11:30 75 1
1001 3 01/01/2024 11:40 01/01/2024 13:15 80 0
1001 4 01/02/2024 08:00 01/03/2024 10:07 300 0
1002 1 01/01/2024 09:15 01/01/2024 10:00 54 1
1002 2 01/02/2024 08:00 01/02/2024 10:09 134 0
1002 3 01/02/2024 08:16 01/02/2024 09:13 79 0
1002 4 01/02/2024 08:34 01/02/2024 09:02 82 1
1002 5 01/03/2024 07:45 01/03/2024 08:23 93 1
1002 6 01/03/2024 09:10 01/03/2024 10:00 95 0
1002 7 01/03/2024 09:15 01/03/2024 11:35 204 0 ](delimiter is '\t');
New:
Load *,
if(OrderID=Previous(OrderID) and Previous(Flag)=1, RangeSum(Flag,Peek('IncorrectTransfers')),Flag) as IncorrectTransfers,
if(OrderID=Previous(OrderID) and Previous(Flag)=1, RangeSum(KM,Peek('TotalKM')),KM) as TotalKM,
if(OrderID=Previous(OrderID) and Previous(Flag)=1, Peek('Start_Date-Time-new'),[Start_Date-Time]) as [Start_Date-Time-new]
Resident Data
Order by OrderID,Trans_Num;
Drop Table Data;
Final:
NoConcatenate
Load *
Resident New
where Flag=0;
Drop Table New;
Drop Fields [Start_Date-Time],Flag,KM;
Note: if Trans_Num is not sequential then use Start Date Time In Order by statement in script
@Amit_B Try below
Data:
Load * Inline [
OrderID Trans_Num Start_Date-Time End_Date-Time KM Flag
1001 1 01/01/2024 08:00 01/01/2024 09:00 89 0
1001 2 01/01/2024 10:00 01/01/2024 11:30 75 1
1001 3 01/01/2024 11:40 01/01/2024 13:15 80 0
1001 4 01/02/2024 08:00 01/03/2024 10:07 300 0
1002 1 01/01/2024 09:15 01/01/2024 10:00 54 1
1002 2 01/02/2024 08:00 01/02/2024 10:09 134 0
1002 3 01/02/2024 08:16 01/02/2024 09:13 79 0
1002 4 01/02/2024 08:34 01/02/2024 09:02 82 1
1002 5 01/03/2024 07:45 01/03/2024 08:23 93 1
1002 6 01/03/2024 09:10 01/03/2024 10:00 95 0
1002 7 01/03/2024 09:15 01/03/2024 11:35 204 0 ](delimiter is '\t');
New:
Load *,
if(OrderID=Previous(OrderID) and Previous(Flag)=1, RangeSum(Flag,Peek('IncorrectTransfers')),Flag) as IncorrectTransfers,
if(OrderID=Previous(OrderID) and Previous(Flag)=1, RangeSum(KM,Peek('TotalKM')),KM) as TotalKM,
if(OrderID=Previous(OrderID) and Previous(Flag)=1, Peek('Start_Date-Time-new'),[Start_Date-Time]) as [Start_Date-Time-new]
Resident Data
Order by OrderID,Trans_Num;
Drop Table Data;
Final:
NoConcatenate
Load *
Resident New
where Flag=0;
Drop Table New;
Drop Fields [Start_Date-Time],Flag,KM;
Note: if Trans_Num is not sequential then use Start Date Time In Order by statement in script
Thank you for the asnwer.
Due to a few cases, I need too take the minimum Start_Date-Time and the maximum End_Date-Time.
How would you change the script?
I tried this for minimum Start_Date-Time:
If(OrderID=Previous(OrderID) and Previous(Flag)=1,
If([Start_Date-Time] < Peek('Start_Date-Time-new'), Start_Date-Time, Peek('Start_Date-Time-new')),
Start_Date-Time) as Start_Date-Time-new
OrderID | Trans_Num | Start_Date-Time | End_Date-Time | KM | Flag |
1002 | 4 | 01/03/2024 07:45 | 01/03/2024 19:02 | 82 | 1 |
1002 | 5 | 01/02/2024 08:34 | 01/03/2024 08:23 | 93 | 1 |
1002 | 6 | 01/03/2024 09:10 | 01/03/2024 10:00 | 95 | 0 |
Create a Resident table
Max(End_Date-Time )
min(start)
max(transnum),orderid
group by orderid
Try this
tab1:
load *,
if(previous(Flag)=1,peek(Start_Date),"Start_Date-Time") as Start_Date,
if(Flag=0,"End_Date-Time",peek("End_Date-Time",RecNo(),'tab')) as End1
resident tab;
drop table tab;
tab2:
load *,
if(Flag=1,
if(End1=peek('End1',RecNo(),'tab1'),End1,peek('End1',RecNo(),'tab1')),"End_Date-Time") as End_Date
resident tab1;
drop table tab1;
drop fields End1,"Start_Date-Time","End_Date-Time";
@Amit_B change the order by like below. Then logic I provided should work
Order by OrderID,Start_Date-Time;