Is it Pre or Post Ceding Load?

    What happens when you load a raw data element that you need to do some work on ... but you also need to use that modified field value multiple times afterwards?

     

    Well you could do the Load statement

    Table:

    Load

    Field1,

    Field2,

    Function1(Function2(Function3( Field3) )) as [Modified Version of Field 3],

    Field4

    from ....;

     

    Then do a self join back to the table to add the other fields

    Join (Table)

    Load Field1,

    Function4([Modified Version of Field3]) as Field5

    ....;

    That's easy just time consuming. Or you can simply redo all of the work you did over and over and over to create Field5, 6, 7, 8, 9 and 10. Easy for you to copy and paste ... but then you use lots of CPU for reloads and you have to continually maintain any changes you make to what you edit.

     

    The concept of a Preceding Load is that It allows you to reuse calculated fields as many times as you would like. Syntax for it is basically like this:

     

    Table:

    Load *,

    Function4([Calculated Field]) as Field5;

    Load

    Field1,

    Field2,

    Function1(Function2(Function3( Field3) )) as [Calculated Field],

    Field4

    from ....;

     

    The interesting thing about the way it works is that it loads from the bottom up. Which then begs the question ... is it Preceding Load or PostCeding Load since you are adding to data after it's been loaded. Symantics aside a Preceding Load enables you to do a single pass through the load without requiring you to redo all of the calculations.

     

    This video example demonstrates how to created many dates off of a TIMESTAMP field in the General Hospital data set. The data loaded has to be modified to put it into the correct TIMESTAMP format and we don't wan to repeat that work for all of the other uses.