What happens when you load a raw data element that you need to do some work on ... but you also need to use that modified field value multiple times afterwards?
Well you could do the Load statement
Function1(Function2(Function3( Field3) )) as [Modified Version of Field 3],
Then do a self join back to the table to add the other fields
Function4([Modified Version of Field3]) as Field5
That's easy just time consuming. Or you can simply redo all of the work you did over and over and over to create Field5, 6, 7, 8, 9 and 10. Easy for you to copy and paste ... but then you use lots of CPU for reloads and you have to continually maintain any changes you make to what you edit.
The concept of a Preceding Load is that It allows you to reuse calculated fields as many times as you would like. Syntax for it is basically like this:
Function4([Calculated Field]) as Field5;
Function1(Function2(Function3( Field3) )) as [Calculated Field],
The interesting thing about the way it works is that it loads from the bottom up. Which then begs the question ... is it Preceding Load or PostCeding Load since you are adding to data after it's been loaded. Symantics aside a Preceding Load enables you to do a single pass through the load without requiring you to redo all of the calculations.
This video example demonstrates how to created many dates off of a TIMESTAMP field in the General Hospital data set. The data loaded has to be modified to put it into the correct TIMESTAMP format and we don't wan to repeat that work for all of the other uses.
Feel free to download the attached video and watch at your leisure.
Thank you for sharing! It is my understanding that a preceding load cannot be used on a wildcard load (FROM MyData_*.csv). Is there a workaround to accommodate such a load?