In QlikView help for Keep it says Joins should be avoided suggesting that Keep is a better option.
In help it says:
"The explicit Join prefix in QlikView script language performs a full join of the two tables. The result is one table. In many cases such joins will result in very large tables. One of the main features of QlikView is its ability to make associations between multiple tables instead of joining them, which greatly reduces memory usage, increases processing speed and offers enormous flexibility. Explicit joins should therefore generally be avoided in QlikView scripts. The keep functionality was designed to reduce the number of cases where explicit joins needs to be used."
Yet when I have seen most other developers of QlikView work, they always use Joins.
Why is this?
QlikView is not a database. Whilst you can create a data model by treating QlikView like a relational database, this will not create an efficient data model, especially when working with large data sets.
The other alternative to join and keep is to use ApplyMap.
IMHO, the statement 'Explicit joins should therefore generally be avoided in QlikView scripts.' is just not valid in its generality.
You need to understand how QV works, e.g. how it makes use of bit stuffed pointers, data tables and symbol tables.
Then what the difference is between linked tables and joined tables, when it comes to aggregation (possible data duplication).
And what the difference is regarding performance (trade-off between memory consumption and possible speed advantage because there is no need of run-time joins).
I think only then you can assess which data model suits your needs best (and sometimes you need to make tests with your real data with different models to make a solid statement).
Besides this, I think the real alternative to a KEEP would not be JOIN, but a WHERE EXISTS clause.
That's an interesting question. I agree with swuehl that the statement "Explicit joins should therefore generally be avoided in QlikView scripts" is way too broad, I would even say incorrect. Some additional thoughts.
1. First, coding for clarity and maintainability is typically the first priority, as it should be, So people code what is familiar and easiest to understand.
2. KEEP is poorly understood. It's mentioned in the QlikView Developer course, but no exercises use it, whereas there are several exercises with JOIN. It has no analogy in SQL (the baseline knowledge for many), it's a QV specific thing. I very rarely use KEEP, or even remember to consider it.
3. Most people keep dimensions in separate tables and filter the load using "where exists()". That's a pretty straight-forward construct that has the same outcome as KEEP.
4. For a 2 field Dimension table of Key and Value, a more efficient data model results from JOIN followed by a DROP of the Key.
5. There are cases where JOINing tables is desirable, such as:
- Placing all measure fields in the same table can be an important performance boost in large apps.
- Resolving circular references or synkeys.
I think KEEP is cool, and I'm probably overlooking some great use cases for it. But I think the wording of the doc is perhaps dated (32 bit era) and a little too strident.
> 4. For a 2 field Dimension table of Key and Value, a more efficient data model results from JOIN followed by a DROP of the Key.
For a 2-field Dimension (Key & Value) surely creating a mapping table and using ApplyMap is more efficient than joining then dropping the key?