Subroutine to Create Data Model for From/To Date Selection

    It seems common to need to select an arbitrary date range by simply selecting a From and To date, such as with calendar objects. This seems to usually be handled through the use of variables either triggering selections on the main date field or through the use of set analysis referring to those variables.

     

    Both approaches work and have their merits, but I tend to prefer data model solutions where practical. Henric Cronström and I'm sure others have pointed out that it could be handled in the data model through the use of two AsOf tables, one pointing forward in time and one pointing backwards. The problem with this is that the number of rows required is the square of the number of days involved. That's fine for shorter date ranges, but if you want to look at 30 years of data, for instance, that's more than 100 million rows. That's not practical.

     

    However, we can greatly reduce that number if we also add two date key tables, following the idea I'd suggested here. A date key is either a single date, a month, or a year. So to link to three years of dates, we only need three values. With that approach, 30 years of data requires fewer than 2 million rows.

     

    Here are a couple pictures to help show what's going on:

     

    Capture.PNGCapture2.PNG

     

    I wrote a subroutine to do all the hard work. Build your data model normally. Then call the subroutine, telling it the name of the existing date field, and what you want to call the new from and to date fields. It'll build the data structures to link everything together. Make selections in your new from and to date fields, and it'll link to all the dates in that range. I plugged it into one of my real applications, and it seemed to work properly without modification.

     

    There's probably some cleaner way to write the subroutine. I don't write many subroutines, and I didn't have a lot of time to think about how this might be done more cleanly. Hopefully it's bug free, but I easily could have made a mistake. I'll correct any that are found.

     

     

    SUB CreateFromToLinkage (DateField, FromDateField, ToDateField)

    [All Dates]:
    NOCONCATENATE
    LOAD date(fieldvalue('$(DateField)',recno())) as [$(DateField)]
    AUTOGENERATE fieldvaluecount('$(DateField)')
    ;
    [Min Max Dates]:
    NOCONCATENATE
    LOAD
    num(min([$(DateField)])) as [From]
    ,
    num(max([$(DateField)])) as [To]
    RESIDENT [All Dates]
    ;
    LET From = peek('From');
    LET To = peek('To');
    DROP TABLE [Min Max Dates];

    [Ranges]:
    NOCONCATENATE
    LOAD
    date($(From)+recno()-1) as [$(FromDateField)]
    ,
    $(To) as [$(ToDateField)]
    AUTOGENERATE $(To)-$(From)+1
    ;

    CALL LinkToDates (FromDateField, DateField, FromDateField, ToDateField);

    [Ranges]:
    NOCONCATENATE
    LOAD
    $(From) as [$(FromDateField)]
    ,
    date($(From)+recno()-1) as [$(ToDateField)]
    AUTOGENERATE $(To)-$(From)+1
    ;
    CALL LinkToDates (ToDateField, DateField, FromDateField, ToDateField);

    DROP TABLE [All Dates];
    SET From =;
    SET To =;

    END SUB

    SUB LinkToDates (Field, DateField, FromDateField, ToDateField)

    [$(Field)]:
    // Days
    NOCONCATENATE
    LOAD
    [$(Field)]
    ,
    text(date([$(FromDateField)]+iterno()-1,'YYYY-MM-DD')) as [$(Field) Date Key]
    RESIDENT Ranges
    WHILE   [$(FromDateField)]+iterno()-1 < rangemin(monthstart([$(FromDateField)],1),[$(ToDateField)]+1)
    ;
    // Months
    CONCATENATE ([$(Field)])
    LOAD
    [$(Field)]
    ,
    text(date(monthstart([$(FromDateField)],iterno()),'YYYY-MM')) as [$(Field) Date Key]
    RESIDENT Ranges
    WHILE   monthstart([$(FromDateField)],iterno()) < rangemin(yearstart([$(FromDateField)],1),monthstart([$(ToDateField)]))
    ;
    // Years
    CONCATENATE ([$(Field)])
    LOAD
    [$(Field)]
    ,
    text(date(yearstart([$(FromDateField)],iterno()),'YYYY')) as [$(Field) Date Key]
    RESIDENT Ranges
    WHILE   yearstart([$(FromDateField)],iterno()) < yearstart([$(ToDateField)])
    ;
    // Months
    CONCATENATE ([$(Field)])
    LOAD
    [$(Field)]
    ,
    text(date(monthstart([$(ToDateField)],-iterno()),'YYYY-MM')) as [$(Field) Date Key]
    RESIDENT Ranges
    WHILE yearstart([$(ToDateField)]) > yearstart([$(FromDateField)])
    AND   monthstart([$(ToDateField)],-iterno()) >= yearstart([$(ToDateField)])
    ;
    // Days
    CONCATENATE ([$(Field)])
    LOAD
    [$(Field)]
    ,
    text(date([$(ToDateField)]-iterno()+1,'YYYY-MM-DD')) as [$(Field) Date Key]
    RESIDENT Ranges
    WHILE monthstart([$(ToDateField)]) > monthstart([$(FromDateField)])
    AND   [$(ToDateField)]-iterno()+1 >= monthstart([$(ToDateField)])
    ;

    [$(Field) Date Keys]:
    LOAD text(fieldvalue('$(Field) Date Key',recno())) as [$(Field) Date Key]
    AUTOGENERATE fieldvaluecount('$(Field) Date Key')
    ;
    [All Date Keys]:
    // Days
    NOCONCATENATE
    LOAD
    [$(DateField)]
    ,
    text(date([$(DateField)],'YYYY-MM-DD')) as [$(Field) Date Key]
    RESIDENT [All Dates]
    ;
    // Months
    CONCATENATE ([All Date Keys])
    LOAD
    [$(DateField)]
    ,
    text(date([$(DateField)],'YYYY-MM')) as [$(Field) Date Key]
    RESIDENT [All Dates]
    ;
    // Years
    CONCATENATE ([All Date Keys])
    LOAD
    [$(DateField)]
    ,
    text(date([$(DateField)],'YYYY')) as [$(Field) Date Key]
    RESIDENT [All Dates]
    ;
    LEFT JOIN ([$(Field) Date Keys])
    LOAD *
    RESIDENT [All Date Keys]
    ;
    DROP TABLES
    [Ranges]
    ,[All Date Keys]
    ;
    END SUB