Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
mrichman
Creator II
Creator II

How to group by with maxdate

Hi Community,

I have an issue with a group by. I would like to group by the following:

The maximum date of the underlying case is 2018-06-29. I want to group and add a column by the maximum date with the latest status. So in this case, the maximum date is 2018-06-29 and the latest status should be 2.

Example:

LOAD * INLINE [

    Client, Statusnr, Date1, MaxDate,

    A, 1, 2018-05-26, 2018-06-29

    A, 3, 2018-05-29, 2018-06-29

    A, 2, 2018-06-29, 2018-06-29

] (delimiter is ',');

Ideal result:

LOAD * INLINE [

    Client, Statusnr, Date1, MaxDate, LatestStatus

    A, 1, 2018-05-26, 2018-06-29, 2

    A, 3, 2018-05-29, 2018-06-29, 2

    A, 2, 2018-06-29, 2018-06-29, 2

] (delimiter is ',');

Thank you in advance

1 Solution

Accepted Solutions
daveamz
Partner - Creator III
Partner - Creator III

Hi,

Look at the FirstSortedValue function. In your case it will be something like this:

T:

LOAD *, 1 AS Dummy INLINE [

Client, Statusnr, Date1, MaxDate

A, 1, 2018-05-26, 2018-06-29

B, 3, 2018-05-29, 2018-06-29

C, 2, 2018-06-29, 2018-06-29

] (delimiter is ',');

LEFT JOIN (T)

LOAD Dummy,

     FirstSortedValue(Statusnr, -Date1) AS LatestStatus

RESIDENT T

GROUP BY Dummy

;

DROP FIELD Dummy;

Regards,

David

View solution in original post

9 Replies
daveamz
Partner - Creator III
Partner - Creator III

Hi,

Look at the FirstSortedValue function. In your case it will be something like this:

T:

LOAD *, 1 AS Dummy INLINE [

Client, Statusnr, Date1, MaxDate

A, 1, 2018-05-26, 2018-06-29

B, 3, 2018-05-29, 2018-06-29

C, 2, 2018-06-29, 2018-06-29

] (delimiter is ',');

LEFT JOIN (T)

LOAD Dummy,

     FirstSortedValue(Statusnr, -Date1) AS LatestStatus

RESIDENT T

GROUP BY Dummy

;

DROP FIELD Dummy;

Regards,

David

sunny_talwar

Or just this

T:

LOAD * INLINE [

    Client, Statusnr, Date1, MaxDate

    A, 1, 2018-05-26, 2018-06-29

    B, 3, 2018-05-29, 2018-06-29

    C, 2, 2018-06-29, 2018-06-29

];


Join (T)

LOAD FirstSortedValue(Statusnr, -Date1) AS LatestStatus

Resident T;

mrichman
Creator II
Creator II
Author

Thanks David,

What is the purpose of the Dummy? Does this minimize the datasize?

Thanks in advance.

sunny_talwar

I don't think Dummy is doing anything... if you look at the script below... it essentially gives the same result without needing Dummy field.

mrichman
Creator II
Creator II
Author

Thanks Sunny! I thought it would minimize the database size. During a group by my data load time increases. Haven't found any solution yet.

daveamz
Partner - Creator III
Partner - Creator III

'Dummy' is used in this case to group the whole data set by ('FirstSoredValue' is an aggregation function) and also as a key to link the result to all corresponding values (all in this case). If you have multiple entries for each client and you want to determine the status at that level, you can use 'Client' field instead.

Regards,

David

shiveshsingh
Master
Master

Plz try this

T:LOAD * INLINE [

    Client, Statusnr, Date1

   

    A, 1, 2018-05-26

   

    A, 3, 2018-05-29

   

    A, 2, 2018-06-29];

   

F:

load Client, Statusnr, Date(Date#(Date1,'YYYY-MM-DD'),'DD-MM-YYYY') as Date

Resident T;

drop table T;

inner join (F)

load  FirstSortedValue(Statusnr,-Date) as Latest_Status,date(max(Date)) as Max_Date

Resident F;

sunny_talwar

Here is one where you don't really need aggregation

T:

LOAD Client,

Statusnr,

Date(Date1) as Date1;

LOAD * INLINE [

    Client, Statusnr, Date1

    A, 1, 2018-05-26

    B, 3, 2018-05-29

    C, 2, 2018-06-29

];


MaxDate:

LOAD Max(MaxDate) as MaxDate;

LOAD FieldValue('Date1', RecNo()) as MaxDate

AutoGenerate FieldValueCount('Date1');


LET vMaxDate = Peek('MaxDate');


Join (T)

LOAD Statusnr AS LatestStatus

Resident T

Where Date1 = $(vMaxDate);

mrichman
Creator II
Creator II
Author

Thanks for the explanation.