Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Aggregate to Start & End Dates

Hi

I have source data of which this Load Inline is a small sample :

SourceData :
LOAD * INLINE [
    DimA, Date, DimB
    18524, 01/02/2010, MAALC
    18524, 13/02/2010, MAALC
    18524, 25/02/2010, MAALC
    18524, 26/02/2010, MAALC
    18524, 02/07/2012, MABCN
    18524, 02/07/2012, MABCN
    18524, 27/02/2013, MABCN
    18524, 28/02/2013, MABCN
    18524, 01/05/2013, MAIBZ
    18524, 02/05/2013, MAIBZ
    18524, 02/11/2013, MAIBZ
    18524, 03/11/2013, MAIBZ
    18524, 04/11/2013, MABCN
    18524, 05/11/2013, MABCN
    18524, 22/02/2014, MABCN
    18524, 23/02/2014, MABCN
    12345, 01/02/2010, MAALC
    12345, 13/02/2010, MAALC
    12345, 25/02/2010, MAALC
    12345, 26/02/2010, MAALC
    12345, 02/07/2012, MAIBZ
    12345, 02/07/2012, MAIBZ
    12345, 27/02/2013, MAIBZ
    12345, 28/02/2013, MAIBZ
    12345, 01/05/2013, MAALC
    12345, 02/05/2013, MAALC
    12345, 02/11/2013, MAALC
    12345, 03/11/2013, MAALC
    12345, 04/11/2013, MAIBZ
    12345, 05/11/2013, MAIBZ
    12345, 22/02/2014, MAIBZ
    12345, 23/02/2014, MAIBZ
];

From this I need to aggregate such that for each DimA value, it outputs a row for each time the DimB value changes that shows its StartDate and EndDate.  The joker is that DimB may be repeated for a DimA for a second date range.

I am sure I am explaining it badly, so below is the output I am after.

DimADimBStartDateEndDate
18524MAALC01/02/201026/02/2010
18524MABCN02/07/201228/02/2013
18524MAIBZ01/05/201303/11/2013
18524MABCN04/11/201323/02/2014
12345MAALC01/02/201026/02/2010
12345MAIBZ02/07/201228/02/2013
12345MAALC01/05/201302/11/2013
12345MAIBZ04/11/201323/02/2014

This has been driving me bananas all day and I have gone snow blind with sorting, grouping, peeking, min'ing, max'ing

So any suggestions would be very welcome.

Best Regards,     Bill

1 Solution

Accepted Solutions
Gysbert_Wassenaar

Try this:

SourceData :

LOAD DimA, Date, DimB,

if(previous(DimB) = DimB, peek('Counter'),rangesum(1,peek('Counter'))) as Counter

INLINE [

    DimA, Date, DimB

    18524, 01/02/2010, MAALC

    18524, 13/02/2010, MAALC

...etc

    12345, 22/02/2014, MAIBZ

    12345, 23/02/2014, MAIBZ

];

Result:

load DimA, DimB, Counter, min(Date) as StartDate, max(Date) as EndDate

Resident SourceData

group by DimA, DimB, Counter;

//optionally:

// Drop field Counter;

This does assume that the data is already ordered correctly. If it isn't you'll need an extra resident load to fix the order first.


talk is cheap, supply exceeds demand

View solution in original post

3 Replies
Gysbert_Wassenaar

Try this:

SourceData :

LOAD DimA, Date, DimB,

if(previous(DimB) = DimB, peek('Counter'),rangesum(1,peek('Counter'))) as Counter

INLINE [

    DimA, Date, DimB

    18524, 01/02/2010, MAALC

    18524, 13/02/2010, MAALC

...etc

    12345, 22/02/2014, MAIBZ

    12345, 23/02/2014, MAIBZ

];

Result:

load DimA, DimB, Counter, min(Date) as StartDate, max(Date) as EndDate

Resident SourceData

group by DimA, DimB, Counter;

//optionally:

// Drop field Counter;

This does assume that the data is already ordered correctly. If it isn't you'll need an extra resident load to fix the order first.


talk is cheap, supply exceeds demand
Anonymous
Not applicable
Author

Gysbert

Many thanks.  Your script line

     if(previous(DimB) = DimB, peek('Counter'),rangesum(1,peek('Counter'))) as Counter

is indeed cunning and I am pretty sure is what I needed.

I'll adapt your suggestion to my real data and advise, but I am optimistic.

Many Thanks,     Bill

Anonymous
Not applicable
Author

Gysbert

Yup, works perfectly.

Many, Many Thanks,     Bill