Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have a table with CustomerID, Contract Start Date and Contract End Date fields.
One Customer can have multiple following contracts.
I need to merge rows where the gap between an older and a newer contract is <= 3 days. Contracts with longer gap should not be merged. Here is an example:
Rows 2-4 and 5-6 should be merged. The result should look like this:
I've already tried to merge rows using resident load and Min/Max functions, but I have no idea how and where to define the condition of <=3 days. Any suggestions?
Thank You!
For this specific data the below script might work. But I think this might not work with your actual data. Would you be able to add more data to your sample?
Table:
LOAD * Inline [
RowNo, CustomerID, StartDate, EndDate
1, 1, 01.01.2008, 31.12.2008
2, 1, 05.01.2009, 31.12.2009
3, 1, 02.01.2010, 31.12.2010
4, 1, 01.01.2011, 21.12.2011
5, 2, 01.01.2012, 31.12.2012
6, 2, 01.01.2013, 31.12.2013
];
Temp:
LOAD *,
If(CustomerID = Peek('CustomerID'), Peek('StartDate')) - EndDate as Diff
Resident Table
Order By CustomerID, StartDate desc;
FinalTable:
LOAD CustomerID,
Date(Min(StartDate)) as StartDate,
Date(Max(EndDate)) as EndDate
Resident Temp
Where Diff <= 3
Group By CustomerID;
Concatenate(FinalTable)
LOAD CustomerID,
StartDate,
EndDate
Resident Temp
Where Diff > 3;
FinalFinalTable:
NoConcatenate
LOAD RowNo() as RowNo,
*
Resident FinalTable;
DROP Tables Table, Temp, FinalTable;
For this specific data the below script might work. But I think this might not work with your actual data. Would you be able to add more data to your sample?
Table:
LOAD * Inline [
RowNo, CustomerID, StartDate, EndDate
1, 1, 01.01.2008, 31.12.2008
2, 1, 05.01.2009, 31.12.2009
3, 1, 02.01.2010, 31.12.2010
4, 1, 01.01.2011, 21.12.2011
5, 2, 01.01.2012, 31.12.2012
6, 2, 01.01.2013, 31.12.2013
];
Temp:
LOAD *,
If(CustomerID = Peek('CustomerID'), Peek('StartDate')) - EndDate as Diff
Resident Table
Order By CustomerID, StartDate desc;
FinalTable:
LOAD CustomerID,
Date(Min(StartDate)) as StartDate,
Date(Max(EndDate)) as EndDate
Resident Temp
Where Diff <= 3
Group By CustomerID;
Concatenate(FinalTable)
LOAD CustomerID,
StartDate,
EndDate
Resident Temp
Where Diff > 3;
FinalFinalTable:
NoConcatenate
LOAD RowNo() as RowNo,
*
Resident FinalTable;
DROP Tables Table, Temp, FinalTable;
That's great! I've already figured out a similar solution, but mine has more steps.
Thank you Sunny T!