Hey everyone, a large problem with unstructured data is that it's, well, unstructured. One of the most common problems we see here at Qualcomm, especially with testing or engineering data is when you have a table of records that tries to merge cells accross rows in Excel. However when it's read in line by line, only the first record receives a value while remainder records are null.
For example, take the data set:
As you can see above, when the Build field needs to repeat its value becomes null because it was a row span. Only the first row contained the value and the remaining records, until the following build, are null.
Introducing the peek() function...
One method to solve this is to load your data into two nested loads. The first load will create a counter field. Basically it will count the records from 1 to n and then reset itself back to 1 when it encounters the next build. The outer load will then take this counter and apply it to the peek function to retrieve the record that countained the last build number. The code is below:
if(isnull([Build ID]), peek([Build ID],(Counter)*-1), [Build ID]) as [New Build ID],*;
if(previous([Build ID]) <> [Build ID], 1, peek(Counter) + 1) as Counter,*;
Wallah! Once you run the above code you should receive a table that looks like this:
|Counter||Date||New Build ID||Build ID||Result|
Hope this helps!