Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
MindaugasBacius
Partner - Specialist III
Partner - Specialist III

Overlapping intervals

Hello,

I am still face up a problem while removing overlapping intervals.

The data:

ClientProductFromTo
AA2015.01.012015.01.15
AA2015.01.152015.01.15
AA2015.01.102015.01.20
AA2015.02.012015.02.10
BB2015.01.012015.01.10
BB2015.01.152015.01.20
BB2015.01.132015.01.13
BB2015.01.132015.01.15
BC2015.01.012015.01.30
BC2015.01.102015.02.15
BC2015.02.202015.02.25
BC2015.02.192015.02.21
BC2015.02.232015.02.26

The result should be:

ClientProductFromTo
AA2015.01.012015.01.20
AA2015.02.012015.02.10
BB2015.01.012015.01.10
BB2015.01.132015.01.20
BC2015.01.012015.02.15
BC2015.02.192015.02.26

I need to remove overlapping intervals depending on Client and Product fields.

Thank you for your help!

1 Solution

Accepted Solutions
sunny_talwar

May be with these modifications

tmp:

LOAD Product,

    From,

    To,

    Client

FROM Data.xls

(biff, embedded labels, table is Sheet1$);

tmp2:

LOAD *,

  if(Product = Previous(Product) and Client = Previous(Client) and From = Peek('LastFrom') and To >= Peek('LastFrom') and To <= Peek('LastTo'), 'Remove', 'Leave') as Status,

  if(Product = Previous(Product) and Client = Previous(Client) and From = Peek('LastFrom') and To >= Peek('LastFrom') and To <= Peek('LastTo'), Peek('LastFrom'), From) as LastFrom,

  if(Product = Previous(Product) and Client = Previous(Client) and From = Peek('LastFrom') and To >= Peek('LastFrom') and To <= Peek('LastTo'), Peek('LastTo'), To) as LastTo

Resident tmp

Order By Client, Product, From desc, To desc;

DROP Table tmp;

tmp3:

NoConcatenate

LOAD Product, From, To, Client,

  if(Product = Previous(Product) and Client = Previous(Client) and From >= Peek('LastFrom') and From <= Peek('LastTo') or To >= Peek('LastFrom') and To <= Peek('LastTo') or Status='Remove', 'Remove', 'Leave') as Status,

  if(Product = Previous(Product) and Client = Previous(Client) and From >= Peek('LastFrom') and From <= Peek('LastTo') or To >= Peek('LastFrom') and To <= Peek('LastTo') or Status='Remove', Peek('LastFrom'), From) as LastFrom,

  if(Product = Previous(Product) and Client = Previous(Client) and From >= Peek('LastFrom') and From <= Peek('LastTo') or To >= Peek('LastFrom') and To <= Peek('LastTo') or Status='Remove', Peek('LastTo'), To) as LastTo

Resident tmp2

Order By Client, Product, From, To;

DROP Table tmp2;

View solution in original post

16 Replies
raajaswin
Creator III
Creator III

Have you tried this?

MindaugasBacius
Partner - Specialist III
Partner - Specialist III
Author

Unfortunately this do not work :

Screenshot_3.jpg

raajaswin
Creator III
Creator III

Check this, still i have doubt on this.

MindaugasBacius
Partner - Specialist III
Partner - Specialist III
Author

Not working.

cwolf
Creator III
Creator III

Hi Mindaugas,

try following:

// 1. Dissolving intervals into discrete rows

TEMP:

LOAD

Min(From) as MinFrom,

Max(To) as MaxTo

Resident TEST;

let vMinFrom=FieldValue('MinFrom',1);

let vMaxTo=FieldValue('MaxTo',1);

DROP Table TEMP;

DATES:

LOAD

RecNo()+$(vMinFrom)-1 as Date

AutoGenerate $(vMaxTo)-$(vMinFrom)+1;

TEMP:

IntervalMatch(Date)

LOAD

From,

To

Resident TEST;

Left Join (TEST)

LOAD Distinct * Resident TEMP;

DROP Fields From,To;

DROP Tables TEMP,DATES;

// 2. Creating new intervals

TEMP:

LOAD

*,

if(Client=Previous(Client) and Product=Previous(Product) and Date-Previous(Date)<2,Peek('No'),if(IsNull(Peek('No')),1,Peek('No')+1)) as No

Resident TEST

Order By Client,Product,Date;

DROP Table TEST;

Left Join(TEMP)

LOAD

No,

Date(Min(Date)) as From,

Date(Max(Date)) as To

Resident TEMP

Group by No;

DROP Field Date;

TEST:

NoConcatenate

LOAD Distinct * Resident TEMP;

DROP Table TEMP;

MindaugasBacius
Partner - Specialist III
Partner - Specialist III
Author

It works great on small scale.

But if I have period of 5 years and 1.000.000 distinct intervals the method struggles to do it's job.

Maybe you have any other ideas?

cwolf
Creator III
Creator III

The script can be optimized a little bit. See attachment.

MindaugasBacius
Partner - Specialist III
Partner - Specialist III
Author

Actually the problem appears in this part of the script as there are creating millions of rows.

TEMP:

IntervalMatch(Date)

LOAD

From,

To

Resident TEST;

Left Join (TEST)

LOAD Distinct * Resident TEMP;

sunny_talwar

Another (although similar) option

Table:

LOAD DISTINCT Client,

  Product,

  Date(From + IterNo() - 1) as Date

While From + IterNo() - 1 <= To;

LOAD * INLINE [

    Client, Product, From, To

    A, A, 2015.01.01, 2015.01.15

    A, A, 2015.01.15, 2015.01.15

    A, A, 2015.01.10, 2015.01.20

    A, A, 2015.02.01, 2015.02.10

    B, B, 2015.01.01, 2015.01.10

    B, B, 2015.01.15, 2015.01.20

    B, B, 2015.01.13, 2015.01.13

    B, B, 2015.01.13, 2015.01.15

    B, C, 2015.01.01, 2015.01.30

    B, C, 2015.01.10, 2015.02.15

    B, C, 2015.02.20, 2015.02.25

    B, C, 2015.02.19, 2015.02.21

    B, C, 2015.02.23, 2015.02.26

];

TempTable:

LOAD *,

  If(Client = Previous(Client) and Product = Previous(Product) and Date = Previous(Date + 1), Alt(Peek('SNo'), 1), RangeSum(Peek('SNo'), 1)) as SNo

Resident Table

Order By Client, Product, Date;

FinalTable:

LOAD Client,

  Product,

  SNo,

  Date(Min(Date)) as From,

  Date(Max(Date)) as To

Resident TempTable

Group By Client, Product, SNo;

DROP Table Table, TempTable;

Capture.PNG