Solved: Re: Overlapping intervals - Page 2 - Qlik Community

MindaugasBacius · ‎2017-01-02

Hello,

I am still face up a problem while removing overlapping intervals.

The data:

Client	Product	From	To
A	A	2015.01.01	2015.01.15
A	A	2015.01.15	2015.01.15
A	A	2015.01.10	2015.01.20
A	A	2015.02.01	2015.02.10
B	B	2015.01.01	2015.01.10
B	B	2015.01.15	2015.01.20
B	B	2015.01.13	2015.01.13
B	B	2015.01.13	2015.01.15
B	C	2015.01.01	2015.01.30
B	C	2015.01.10	2015.02.15
B	C	2015.02.20	2015.02.25
B	C	2015.02.19	2015.02.21
B	C	2015.02.23	2015.02.26

The result should be:

Client	Product	From	To
A	A	2015.01.01	2015.01.20
A	A	2015.02.01	2015.02.10
B	B	2015.01.01	2015.01.10
B	B	2015.01.13	2015.01.20
B	C	2015.01.01	2015.02.15
B	C	2015.02.19	2015.02.26

I need to remove overlapping intervals depending on Client and Product fields.

Thank you for your help!

cwolf · ‎2017-01-04

An other way is it to loop over the distinct intervals until they are clean. So it will not produce millions of discrete rows, but it will needs 4 up to 6 loops to get the clean intervals. But try it.

INPUTTAB:
LOAD
Hash128(Client,Product,From,To) as Key0,
*
;
LOAD * INLINE [
Client    Product    From    To
A    A    2015.01.01    2015.01.15    
...
](delimiter is '\t');

LOOPTAB:
LOAD Distinct
Key0,
AutoNumber(Hash128(Client,Product),'Key') as Key,
num(From) as From#,
num(To) as To#
Resident INPUTTAB;

DROP Fields From,To;

let nor1=NoOfRows('LOOPTAB');

for i= 1 to 100

    Trace *** Loop $(i) ***;
    
    let h=i-1;

    TEMP:
    NoConcatenate
    LOAD
    Hash128(Key,From#,To#) as Key$(i),
    *
    ;
    LOAD
    Key$(h),
    Key,
    If(Key=Previous(Key) and From#>=Previous(From#) and From#-1<=Previous(To#),Previous(From#),From#) as From#, 
    If(Key=Previous(Key) and To#<=Previous(To#) and To#>=Previous(From#),Previous(To#),To#) as To#
    Resident LOOPTAB
    Order By Key,From#,To# desc;
    
    DROP Table LOOPTAB; 
    
    LOOPTAB:
    LOAD Distinct
    Key$(i),
    Key,
    From#,
    To#
    Resident TEMP;
    
    let nor2=NoOfRows('LOOPTAB');
    
    if nor1=nor2 then
        Left Join(INPUTTAB)
        LOAD
        Key$(h),
        Date(From#) as From,
        Date(To#) as To
        Resident TEMP;
        
        DROP Field Key$(h);
        DROP Tables TEMP,LOOPTAB;
        EXIT For;
    else
        Left Join(INPUTTAB)
        LOAD
        Key$(h),
        Key$(i)
        Resident TEMP;
        
        DROP Field Key$(h);
        DROP Table TEMP;

        set nor1=$(nor2);
    end if;

next

OUTPUTTAB:
NoConcatenate
LOAD Distinct * Resident INPUTTAB;

DROP Table INPUTTAB;

MindaugasBacius · ‎2017-01-04

Using While it creates 500 million rows and get stuck.

MindaugasBacius · ‎2017-01-05

stalwar1,

maxgro did help in my other thread Unique intervals and the solution works great on one Client and one Product. Unfortunately, I have spend a reasonable amount of time trying to transform the script that suits for many Clients and Products but with no result. Would you might checking the script?

sunny_talwar · ‎2017-01-05

Checking

sunny_talwar · ‎2017-01-05

May be with these modifications

tmp:

LOAD Product,

From,

To,

Client

FROM Data.xls

(biff, embedded labels, table is Sheet1$);

tmp2:

LOAD *,

if(Product = Previous(Product) and Client = Previous(Client) and From = Peek('LastFrom') and To >= Peek('LastFrom') and To <= Peek('LastTo'), 'Remove', 'Leave') as Status,

if(Product = Previous(Product) and Client = Previous(Client) and From = Peek('LastFrom') and To >= Peek('LastFrom') and To <= Peek('LastTo'), Peek('LastFrom'), From) as LastFrom,

if(Product = Previous(Product) and Client = Previous(Client) and From = Peek('LastFrom') and To >= Peek('LastFrom') and To <= Peek('LastTo'), Peek('LastTo'), To) as LastTo

Resident tmp

Order By Client, Product, From desc, To desc;

DROP Table tmp;

tmp3:

NoConcatenate

LOAD Product, From, To, Client,

if(Product = Previous(Product) and Client = Previous(Client) and From >= Peek('LastFrom') and From <= Peek('LastTo') or To >= Peek('LastFrom') and To <= Peek('LastTo') or Status='Remove', 'Remove', 'Leave') as Status,

if(Product = Previous(Product) and Client = Previous(Client) and From >= Peek('LastFrom') and From <= Peek('LastTo') or To >= Peek('LastFrom') and To <= Peek('LastTo') or Status='Remove', Peek('LastFrom'), From) as LastFrom,

if(Product = Previous(Product) and Client = Previous(Client) and From >= Peek('LastFrom') and From <= Peek('LastTo') or To >= Peek('LastFrom') and To <= Peek('LastTo') or Status='Remove', Peek('LastTo'), To) as LastTo

Resident tmp2

Order By Client, Product, From, To;

DROP Table tmp2;

MarcoWedel · ‎2017-01-06

Hi,

I tried for a different method by identifying overlapping intervals using the intervalmatch prefix and then clustering those intervals using the hierarchy prefix.

Although the result is a bit lengthy and not performing as well as hoped, I want to post it nevertheless:

tabInput:
LOAD RecNo() as ID,
    AutoNumberHash128(Client,Product) as ClntProdID,
    From as Date, *  
FROM [https://community.qlik.com/thread/244866] (html, codepage is 1252, embedded labels, table is @1);

tabIntervals:
IntervalMatch (Date, ClntProdID)
LOAD From,
    To,
    ClntProdID
Resident tabInput;
  
Left Join (tabIntervals)
LOAD From,
    To,
    ClntProdID,
    ID as ParentID
Resident tabInput;

Left Join (tabIntervals)
LOAD Date,
    ClntProdID,
    ID
Resident tabInput;

Left Join (tabIntervals)
LOAD ID,
    Min(From) as MinFrom
Resident tabIntervals
Group By ID;

Left Join (tabInput)
LOAD ID,
    Min(ParentID) as ParentID
Resident tabIntervals
Where From = MinFrom
Group By ID;

DROP Table tabIntervals;

tabOutput:
Hierarchy (ChildID, ParentID, ID)
LOAD Client,
    Product,
    From,
    To,
    ID,
    ID as ChildID,
    ParentID
Resident tabInput;

DROP Table tabInput;

Left Join (tabOutput)
LOAD Distinct
    ID1,
    AutoNumber(ID1) as ClusterID
Resident tabOutput;

tabClusters:
LOAD ClusterID,
    Date(Min(From)) as ClusterFrom,
    Date(Max(To)) as ClusterTo
Resident tabOutput
Group By ClusterID;

hope this helps

regards

Marco

Gestion-PSD · ‎2022-05-20

Hi @MarcoWedel

Your apporach (with a few adjustments) works for me.

I'd the same problem witk overlappings time periods and I've gone mad looking for a solution.

The post marked as solution on this thread doesn't work. But yours does.

Unfortunatelly I cannot mark your post as a solution.

Thank you.

Best regards,

Alonso Torres.

Overlapping intervals

QlikView App Dev