Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us to spark ideas for how to put the latest capabilities into action. Register here!
cancel
Showing results for 
Search instead for 
Did you mean: 
MindaugasBacius
Partner - Specialist III
Partner - Specialist III

Overlapping intervals

Hello,

I am still face up a problem while removing overlapping intervals.

The data:

ClientProductFromTo
AA2015.01.012015.01.15
AA2015.01.152015.01.15
AA2015.01.102015.01.20
AA2015.02.012015.02.10
BB2015.01.012015.01.10
BB2015.01.152015.01.20
BB2015.01.132015.01.13
BB2015.01.132015.01.15
BC2015.01.012015.01.30
BC2015.01.102015.02.15
BC2015.02.202015.02.25
BC2015.02.192015.02.21
BC2015.02.232015.02.26

The result should be:

ClientProductFromTo
AA2015.01.012015.01.20
AA2015.02.012015.02.10
BB2015.01.012015.01.10
BB2015.01.132015.01.20
BC2015.01.012015.02.15
BC2015.02.192015.02.26

I need to remove overlapping intervals depending on Client and Product fields.

Thank you for your help!

1 Solution

Accepted Solutions
sunny_talwar

May be with these modifications

tmp:

LOAD Product,

    From,

    To,

    Client

FROM Data.xls

(biff, embedded labels, table is Sheet1$);

tmp2:

LOAD *,

  if(Product = Previous(Product) and Client = Previous(Client) and From = Peek('LastFrom') and To >= Peek('LastFrom') and To <= Peek('LastTo'), 'Remove', 'Leave') as Status,

  if(Product = Previous(Product) and Client = Previous(Client) and From = Peek('LastFrom') and To >= Peek('LastFrom') and To <= Peek('LastTo'), Peek('LastFrom'), From) as LastFrom,

  if(Product = Previous(Product) and Client = Previous(Client) and From = Peek('LastFrom') and To >= Peek('LastFrom') and To <= Peek('LastTo'), Peek('LastTo'), To) as LastTo

Resident tmp

Order By Client, Product, From desc, To desc;

DROP Table tmp;

tmp3:

NoConcatenate

LOAD Product, From, To, Client,

  if(Product = Previous(Product) and Client = Previous(Client) and From >= Peek('LastFrom') and From <= Peek('LastTo') or To >= Peek('LastFrom') and To <= Peek('LastTo') or Status='Remove', 'Remove', 'Leave') as Status,

  if(Product = Previous(Product) and Client = Previous(Client) and From >= Peek('LastFrom') and From <= Peek('LastTo') or To >= Peek('LastFrom') and To <= Peek('LastTo') or Status='Remove', Peek('LastFrom'), From) as LastFrom,

  if(Product = Previous(Product) and Client = Previous(Client) and From >= Peek('LastFrom') and From <= Peek('LastTo') or To >= Peek('LastFrom') and To <= Peek('LastTo') or Status='Remove', Peek('LastTo'), To) as LastTo

Resident tmp2

Order By Client, Product, From, To;

DROP Table tmp2;

View solution in original post

16 Replies
raajaswin
Creator III
Creator III

Have you tried this?

MindaugasBacius
Partner - Specialist III
Partner - Specialist III
Author

Unfortunately this do not work :

Screenshot_3.jpg

raajaswin
Creator III
Creator III

Check this, still i have doubt on this.

MindaugasBacius
Partner - Specialist III
Partner - Specialist III
Author

Not working.

cwolf
Creator III
Creator III

Hi Mindaugas,

try following:

// 1. Dissolving intervals into discrete rows

TEMP:

LOAD

Min(From) as MinFrom,

Max(To) as MaxTo

Resident TEST;

let vMinFrom=FieldValue('MinFrom',1);

let vMaxTo=FieldValue('MaxTo',1);

DROP Table TEMP;

DATES:

LOAD

RecNo()+$(vMinFrom)-1 as Date

AutoGenerate $(vMaxTo)-$(vMinFrom)+1;

TEMP:

IntervalMatch(Date)

LOAD

From,

To

Resident TEST;

Left Join (TEST)

LOAD Distinct * Resident TEMP;

DROP Fields From,To;

DROP Tables TEMP,DATES;

// 2. Creating new intervals

TEMP:

LOAD

*,

if(Client=Previous(Client) and Product=Previous(Product) and Date-Previous(Date)<2,Peek('No'),if(IsNull(Peek('No')),1,Peek('No')+1)) as No

Resident TEST

Order By Client,Product,Date;

DROP Table TEST;

Left Join(TEMP)

LOAD

No,

Date(Min(Date)) as From,

Date(Max(Date)) as To

Resident TEMP

Group by No;

DROP Field Date;

TEST:

NoConcatenate

LOAD Distinct * Resident TEMP;

DROP Table TEMP;

MindaugasBacius
Partner - Specialist III
Partner - Specialist III
Author

It works great on small scale.

But if I have period of 5 years and 1.000.000 distinct intervals the method struggles to do it's job.

Maybe you have any other ideas?

cwolf
Creator III
Creator III

The script can be optimized a little bit. See attachment.

MindaugasBacius
Partner - Specialist III
Partner - Specialist III
Author

Actually the problem appears in this part of the script as there are creating millions of rows.

TEMP:

IntervalMatch(Date)

LOAD

From,

To

Resident TEST;

Left Join (TEST)

LOAD Distinct * Resident TEMP;

sunny_talwar

Another (although similar) option

Table:

LOAD DISTINCT Client,

  Product,

  Date(From + IterNo() - 1) as Date

While From + IterNo() - 1 <= To;

LOAD * INLINE [

    Client, Product, From, To

    A, A, 2015.01.01, 2015.01.15

    A, A, 2015.01.15, 2015.01.15

    A, A, 2015.01.10, 2015.01.20

    A, A, 2015.02.01, 2015.02.10

    B, B, 2015.01.01, 2015.01.10

    B, B, 2015.01.15, 2015.01.20

    B, B, 2015.01.13, 2015.01.13

    B, B, 2015.01.13, 2015.01.15

    B, C, 2015.01.01, 2015.01.30

    B, C, 2015.01.10, 2015.02.15

    B, C, 2015.02.20, 2015.02.25

    B, C, 2015.02.19, 2015.02.21

    B, C, 2015.02.23, 2015.02.26

];

TempTable:

LOAD *,

  If(Client = Previous(Client) and Product = Previous(Product) and Date = Previous(Date + 1), Alt(Peek('SNo'), 1), RangeSum(Peek('SNo'), 1)) as SNo

Resident Table

Order By Client, Product, Date;

FinalTable:

LOAD Client,

  Product,

  SNo,

  Date(Min(Date)) as From,

  Date(Max(Date)) as To

Resident TempTable

Group By Client, Product, SNo;

DROP Table Table, TempTable;

Capture.PNG