Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
egma
Contributor II
Contributor II

how to aggregate by the last day of the month available in the dataset

Hi guys, 

I have a dataset with3 columns:

¦ date ¦ dataset type ¦ number of datasets ¦

date: date format dd/mm/yyyy

dataset type: it's basically text

number of datasets: integer

Here's an example of dataset:

 

sample of the dataset.png

It basically counts everyday the number of dataset per dataset format (pdf, zip, etc). There are missing data for certains days, I mean, in some cases there are days missing like for instance the last day having data in August 2016 was the 29th of August.

I need to sum up the total number of datasets for only the last day appearing each month. I can't use "MonthEnd" beacause I'm not certain that this date exists for every month of my dataset. So basically I need:

1. detect the last day appearing every month of the year

2. sum up all the "number of datasets" for this date.

3. This aggregation becomes the number total of datasets for the current month

For instance:

July 2016 --> 20 000

August 2016 --> 21 000

guys, I haven found  a suitable solution so far and I'm stuck. So, I'll really really appreciate your advice.

Thanks,

 

 

 

2 Replies
dwforest
Specialist II
Specialist II

create a master calendar that has a dimension for Month

Then use an aggr and firstsortedvalue to get last value; something like:

Aggr(max(firstsortedvalue ([number of datasets], -date),Month)

egma
Contributor II
Contributor II
Author

Thanks Dwforest for your hint, I could create a master calendar and then I tried to aggregate the "number of datasets" columnd and  I got this:

sample of aggregation result.png

I tried to select only the last available day for each month but it still not working. The aggregation I used above is:

= Aggr(Sum([number of datasets]), [dayNum], [monthNum], [YearNum])

Any idea how to pick only the row with the last day in each month?

 

Thanks