Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Max date with apply map

Hello;

I am working on a set of data containing multiple dates.  The goal is for each item to define what the maximum date is.

I am using therefore the apply map function.

Could you please let me know if this is the most efficient methodology to do so.

Many thanks in advance for your help and have a great day.

Kristel.

Attached you will find the spreadsheet used as an example and the qvd file.

Hope you will be able to read this is.

If not here is the script;

Regiongrouping:

mapping load

  Region,

     (max(orderdate))

FROM

(biff, embedded labels)

group by Region;

OderData:

LOAD Region,

     orderdate,

     applyMap ('Regiongrouping', Region) as [Max oderdate],

     value,

     comment

FROM

(biff, embedded labels);

To make sure the load is focussing on the max date only we could close with:

Where orderdate=[Max oderdate];

But is it the best way???

6 Replies
Anonymous
Not applicable
Author

I'd do it a little bit simpler, with one time access to the source data and no applymap:

OderData:
LOAD
     Region,
     orderdate as [Max oderdate],
     value,
     comment
FROM (biff, embedded labels);

//

INNER JOIN (OderData) LOAD
    Region,
     max(orderdate) as [Max oderdate]
RESIDENT OderData
GROUP BY Region;

Not applicable
Author

Another possibility would be to do this:

OrderData:

LOAD Region,

     orderdate,

     Region&'-'&date(orderdate,'YYYYMMDD') as [RegionIndex],

     value,

     comment

FROM

(biff, embedded labels);

RegionLast:

LOAD Region,

     max(orderdate) as [Max order date]

Resident OrderData

Group by Region;

RegionLast1:

load

Region&'-'&date([Max order date],'YYYYMMDD') as RegionIndex,

'1' as Region_LastFlag

Resident RegionLast;

DROP Table RegionLast;

left join (OrderData)

load

  RegionIndex,

  Region_LastFlag as Region_IsLast

resident RegionLast1;

drop table RegionLast1;

RegionDetailsLast:

load

  Region,

     orderdate,

     value,

     Region_IsLast,

     comment

resident OrderData

where Region_IsLast = '1';

drop table OrderData;

What do you think?

And now when you have different scenario, how do you evaluate them on efficiency?

Thanks.

Kristel

Not applicable
Author

Hello Kristel,

Instead of doing it so complex,

just do one thing.Use the aggr() function.

No need to use any mapping load.Use a straight table .add one more calculated dimension.in that use the aggr() function.In the expression just write any 1 and then hide the expression.You will find your desired answer.

the aggr() function will be like       =aggr(max(orderdate),Region)

Capture.PNG

Hope u understood.

Anonymous
Not applicable
Author

Kristel,

I don't see why do you need a long and complex script while you can have it short and simple.

Biren Agrawalla,
As a rule, it is better to have actual fields rather than calculated dimensions - same result but better performance.  Still your solution makes sense if the max date is not the absolute max date per Region but within current date selection.  Even in this case, I'd rather move this expression from calculated dimension to expression.  And, in expression, no need to aggr() because Region is a dimension already, simply:
max(orderdate)


jaimeaguilar
Partner - Specialist II
Partner - Specialist II

Hi,

the aggr() approach is a fast solution to get results on the fly. However, as data volume increases, having a calculated dimension can become so heavy for your application, impacting in performance times.  So if you have access to script is better to make as much pre-calculations as you can in script,

regards

Not applicable
Author


Hi Micheal,

Thanks for the solution.But I know this is not a properway of doing as we have many other ways as well.

But I just posted my thought because this can also be a way of solving.

Its better if we will come to know about as many possible types of solutions for a problem.

Thanks

Regards,

Biren