This article was sparkled by this discussion thread on Qlik Community. The question was: how to calculate the difference between dates that are stored in consecutive rows of a table? Three answers to the question suggested three different approaches. Let's analyze what's right, what's wrong and does it matter at all?
We typically use values from a previous row in the table for two common needs - either for calculating a difference between certain dates (or amounts), or for calculating a running balance of some kind (either numbering rows that belong to a certain entity, or calculating inventory balances, or calculating inventory age, etc...)
We know that any calculation involving multiple rows in a data table, requires two steps:
Usually, we need to compare the "current" value of a key field (let's call it ID) with the previous value of the same field. This way, we will know if this is still the same entity (product, customer, patient, etc...) or if it's a brand new entity. Depending on the result of this comparison, we will calculate the second step slightly differently.
Once we know whether this row belong to the same entity or the new entity, we can perform our calculation that involves the previous values.
We also know that there are two functions that look quite similar to us, Peek() and Previous(), that can help us retrieve the values from the previous row. So, the question at hand is - can we use either one of the two in both steps of our inter-record calculation, or should we prefer to use one vs. the other. Is it ever wrong to use on of the functions for one of the steps? Back to the Qlik Community discussion, the three answers to the question suggested the following:
1. Manish suggested to use the Previous() function in both steps:
If(ID = Previous(ID),Previous(DateON) - DateOFF, Null()) as TimeOFF
2. Then, Sinan suggested to use the Peek function in both steps:
If(ID = Peek('ID'), Peek('DateON') - DateOFF, Null()) as TimeOFF
3. Then, I recommended to use the following rule of thumb:
IF Previous() THEN Peek().
Now, let's find out - is there a difference, what should we use and when? The following explanation is borrowed from my new book, QlikView Your Business, that you should certainly consider if you are serious about QlikView and Qlik Sense (shameless advertising!). If you already have the book in your hands, this explanation is on page 610.
The Peek() function receives up to three parameters:
Peek('FieldName', Row, 'TableName')
It returns the value of the specified field from the specified row of the specified table. When no table is specified, the current table is assumed. When no row is specified, the last loaded row is assumed. Row numbering begins with 0 for the first record, 1 for the second record, and so on. Negative row numbers allow you to retrieve values from the last row in the table (row -1), or the second last row (-2), and so on.
The data for the Peek() function is loaded from the previously loaded table in the associative QlikView database.
The Peek() function can be used in a load to retrieve data from the same table or from another table. It can also be used outside of a load, in which case the table name and the row number are mandatory.
Notice that both the field name and the table name need to be provided as strings, enclosed in single quotes.
Conversely, the Previous() function only accepts a single parameter:
The expression is evaluated using data from the previous input record. The data is drawn directly from the input source, not necessarily from the QlikView database. This means that the expression can include fields that may not be stored in QlikView, as long as they are available in the source table. On the other hand, newly calculated fields that are created in QlikView in the same load cannot participate in this expression.
In most cases, the expression simply consists of a single field, and that’s where Previous(Field) looks similar to Peek('Field'). Notice that the field name does not need to be enclosed in quotes for the Previous() function, because the parameter is not a string, but rather an expression or a direct field reference.
Both functions can be used in a LOAD that involves calculating a running balance of any kind, but not in the same way. Just to remind you, any calculation of a running balance by one or a few key dimensions consists of two steps:
Check if the current record is the first for the set of the key dimensions. This is done by comparing a set of key dimension values from the current row with the same values from the previous row.
If this is the first record, restart the Balance from the current quantity (or from the beginning balance); otherwise, add the current quantity to the previous value of the Balance (depending on a specific scenario, you may be adding to or subtracting from the balance).
In the first step of this process, either one of the two functions can be used successfully. We recommend using Previous() to enable comparing expressions such as concatenated key dimension values, as opposed to always comparing distinct fields. Another benefit is that unlike the Peek() function, the Previous() function will not fail silently. If any field names are misspelled, you will know about it.
In the second part of the process (when calculating running balances), the Previous() function cannot be used, because the Balance field is likely to be a new field that is being created in the same load. Since the field didn’t exist before the statement was executed, the syntax checker wouldn’t allow its use in the expression for the Previous() function. In this case, the Peek() function wins, because of its string parameters and because of its silent fail. The field Balance may not exist at the beginning of the LOAD; however, it will be perfectly valid when it needs to be used for the Peek() function in the second row of the loaded table.
So, remember the following schema for any calculations of running balances:
IF Previous(), THEN Peek()
This means use Previous() in the IF condition and then use Peek() to calculate the running balance.
So, having read all that - what was the right answer to the Qlik Community question? In that particular case, when no running balances need to be calculated, all three answers are technically right. Both functions can be used equally well for calculating a difference between dates stores in consecutive rows. However, remember this rule of thumb for any calculations that involve running balances: