Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Employee
Employee

The QlikView Cache

QlikView has a very efficient, patented caching algorithm that effectively eliminates the calculation time for calculations that have been made before. In other words, if you use the “back” button in the toolbar, or if you happen to make a selection that you have made before, you usually get the result immediately. No calculation is necessary.

But how does it work? What is used as lookup ID?

For each object, or for each combination of data set and selection (or of data subset and expression), QlikView calculates a digital fingerprint that identifies the context. This fingerprint is used as lookup ID and stored in the cache together with the result of the calculation.


Here "calculation" means both the Logical Inference and Chart calculation - or in fact, any expression anywhere. This means that both intermediate and final results of a selection are stored.
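The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not describe the actual (patented) algorithm, so the use of SHA-256, the way the inputs are combined, and the names `fingerprint` and `ResultCache` are all hypothetical.

```python
import hashlib
from typing import Callable, Dict, List


def fingerprint(data_set_id: str, selection: Dict[str, List[str]], expression: str) -> str:
    """Hypothetical lookup ID built from data set, selection and expression text."""
    # Canonicalize the selection so the same selected values, listed in a
    # different order, produce the same key (an assumption of this sketch).
    canonical_selection = sorted(
        (field, sorted(values)) for field, values in selection.items()
    )
    payload = "\x1f".join([data_set_id, repr(canonical_selection), expression])
    # SHA-256 is an assumed stand-in for the real fingerprint function.
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ResultCache:
    """Stores calculation results under their fingerprint."""

    def __init__(self) -> None:
        self._entries: Dict[str, object] = {}
        self.misses = 0

    def get_or_compute(self, key: str, compute: Callable[[], object]) -> object:
        # Only calculate when no result is stored under this fingerprint.
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = compute()
        return self._entries[key]
```

In this sketch the key does not mention a user or a document, so two callers asking for the same data set, selection and expression would share one entry. The expression text is hashed verbatim, so a differently spelled expression yields a different key.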

There are some peculiarities you need to know about the cache…

  • The cache is global. It is used for all users and all documents. A cache entry does not belong to one specific document or one user only. So, if a user makes a selection that another user has already made, the cache is used. And if you have the same data in two different apps, one single cache entry can be used for both documents.
  • Memory is not returned when the document is unloaded. Cache entries will usually not be purged until the RAM usage is close to or has reached the lower working set limit. QlikView will then purge some entries and re-use the memory for other cache entries. This behavior sometimes makes people believe there is a memory leak in the product. But have no fear: it should be this way. So, you do not need to restart the service to clear the cache.
  • The oldest cache entries are not purged first. Instead, several factors are used to calculate a priority for each cache entry: factors like RAM usage, cost to calculate it again, and time since the most recent usage. Entries with a combined low priority will be purged when needed. Hence, an entry that is cheap to calculate again will easily be purged, even if it was used recently. And an entry that is expensive to recalculate or uses only a small amount of RAM will be kept for a much longer time.
  • The cache is not cleared when running macros, contrary to what I have seen some people claim.
  • You need to write your expression exactly right. If the same expression is used in several places, it should be written exactly the same way (same capitalization, same number of spaces, etc.), otherwise it will not be considered to be the same expression. If you do, there should be no big performance difference between repeating the formula, referring to a different expression using the label of the expression, or using the Column() function.
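The purge rule in the list above (priority instead of age) can be illustrated with a small Python sketch. The three factors come from the post; how they are combined is not stated there, so the formula in `priority`, the `Entry` fields and the `purge` helper are hypothetical choices made only for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Entry:
    key: str
    ram_bytes: int    # memory held by the cached result
    recalc_ms: float  # cost to calculate the result again
    idle_s: float     # time since the most recent usage


def priority(entry: Entry) -> float:
    # Assumed weighting: expensive results score high; large or
    # long-unused results score low. The real formula is not public.
    return entry.recalc_ms / (entry.ram_bytes * (1.0 + entry.idle_s))


def purge(entries: List[Entry], ram_limit: int) -> Tuple[List[Entry], List[Entry]]:
    """Drop the lowest-priority entries until RAM usage fits the limit."""
    kept = sorted(entries, key=priority, reverse=True)
    used = sum(e.ram_bytes for e in kept)
    purged: List[Entry] = []
    while kept and used > ram_limit:
        victim = kept.pop()  # lowest priority sits at the end
        used -= victim.ram_bytes
        purged.append(victim)
    return kept, purged
```

With this weighting, a cheap entry that was used a second ago can be purged before an expensive entry that has been idle for ten minutes, which is the behavior the third bullet describes.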

The cache efficiently speeds up QlikView. Basically it is a way to trade memory against CPU-time: If you put more memory in your server, you will be able to re-use more calculations and thus use less CPU-time.

HIC

Further reading on the Qlik engine internals:

Symbol Tables and Bit-Stuffed Pointers

Colors, States and State vectors

Logical Inference and Aggregations

72 Comments
darkhorse
Valued Contributor

Very helpful post.

Interesting point with the expression writing though.

I wonder if it makes a noticeable difference. Need to test.

19 Views
Not applicable

Hi Henric,

The cache is global.


If Section Access is enabled, what happens? Will the cached data simply be ignored, or will QlikView verify whether both users have the same level of authorization to access the data?


Karthik

19 Views
richard_pearce6
Valued Contributor

Thanks, very informative and concise


19 Views
simondachstr
Valued Contributor III

Thanks for the post. In the QMS API documentation, there's a method called ClearQVSCache which can "clear" the following members:

Member name            Value  Description
None                   0      No object specified.
License                1      Cached QlikView Server license information.
Settings               2      Cached QlikView Server settings.
UserDocumentList       4      Cached QlikView Server user document file structure.
UserDocumentMetaData   8      Cached QlikView Server user document meta data.
CALConfiguration       16     Cached QlikView Server CAL configuration.
All                    65535  All cache objects.

Could you please elaborate on those cache members and how they differ from what you described in your blog?

19 Views
chris_johnson
Contributor II

Hi,

Interesting article Henric.

Does this also apply with using variables as Expressions? If I were to use the expression $(SalesFigure) in multiple charts so that this is consistent would this be considered as the same expression, and would there be any additional overhead with having to resolve the content of the variable?

Thanks,

Chris

19 Views
MVP
MVP

You need to write your expression exactly right.


This seems to be very interesting. I've never heard this before, and I wonder why the expressions aren't compared after parsing. Do comments have an effect as well? I see some potential for improvement there.


- Ralf

19 Views
dvqlikview
Honored Contributor II

Many thanks, HIC. Interesting post and one of my favorite topics.

I agree with Ralf regarding - You need to write your expression exactly right.

Chris - I think expressions will be cached even if we use variables with dollar sign expansion. I'd think the variables are containers of expressions instead of just holding the absolute value.

Has anyone tried or considered using an SSD and using the swap file for caching?

Thanks,

DV

0 Likes
19 Views
barryharmsen
Contributor II

Great post Henric! For those who really want to experience how much performance caching adds, I've written a little post that shows you how to turn off caching (don't forget to turn it back on!). The difference is extreme: The power of QlikView caching » The Qlik Fix!

Ralf Becher the "your expressions need to be exactly the same" thing was covered nicely in this blog post from last year, a very interesting discovery: Performance (and other) benefits of using expression column and label references « BI Commons

19 Views
Employee
Employee

Since the database will be different for different levels of authorization, the thumbprints will not match.

19 Views
Not applicable

Very informative. But some questions come up:

And if you have the same data in two different apps, one single cache entry can be used for both documents.

If a 'Total' app is distributed along with a level1 loop-and-reduced set of apps and a level2 loop-and-reduced set of apps, is the cache really shared between these apps? Example: will the cache from the level1, value1 app be available when a user selects value1 in the Total app?

And second, if level1 is present in another app that uses another set of data and is loaded from different QVDs, is level1 then present only once in memory?

0 Likes
19 Views
Not applicable

Very interesting article. Congrats!

0 Likes
19 Views
Employee
Employee

Karthikeyan S Just as Johan Idh points out, the data sets are different (due to Section Access) so the digital fingerprints will not match. Hence, different results if the users have different authorization scopes. (By the way, Johan is one of our developers, so you should listen to him more than to me... )

HIC

19 Views
Employee
Employee

Martin Mahler The Blog post is about the Cache in the QlikView engine, which you can find in the QlikView server service or in QlikView Desktop. It handles selections and calculations. Your question is about managing the QlikView Server, which is something different. It has its own cache, which is implemented differently from the QlikView engine. So - it is not in the same cache.

HIC

19 Views
Employee
Employee

Jerrik Walløe Yes the cache is used for all three apps. But note, what you call "Total", "Level1" and "Level2" are most likely three different data sets, i.e. different digital fingerprints. Whereas "Level3a" and "Level3b" might have the same data set (but different distribution groups) so they would share the cache entries.

HIC

19 Views
Employee
Employee

As Henric already mentioned, the ClearQVSCache method in the QMS API is specific to the QMC cache of QVS entities, not the QVS cache itself. QMS will cache lists of documents and CALs and such, and this method purges that cache.

So (un)fortunately, they have no direct relation.

19 Views
hariharasudan_p
New Contributor III

Henric,

Thanks for the wonderful post and nice discussion

Barry/Henric,

Apart from Settings -> Document Properties, where I can find the calc time for each chart (and where most of the time I get 0), is there an accurate way of finding out how much memory a chart takes to calculate?

0 Likes
19 Views
darkhorse
Valued Contributor

Try Document Settings-->Sheets--> Calc Time of objects.

0 Likes
19 Views
Not applicable

Great post, Thanks!

0 Likes
19 Views

Good post.  Well worth the read.

When developing in QV Desktop, I notice that occasionally the dashboard does not truly reflect, data-wise, the changes I have made. I suspect this is because my changes have confused the caching. Closing QV Desktop and restarting it sorts out these issues, I guess because the close / restart clears the cache.

Does anyone know of a way of clearing the cache in QV Desktop without the close / restart?

Best Regards,     Bill

0 Likes
19 Views
pascal_theurot
Contributor II

Very interesting, like the "Under the Hood" session at Qonnections.

Thanks.

0 Likes
19 Views
Not applicable

That's an interesting question. (I mean Chris's question: Does this also apply with using variables as Expressions? ) I'd like to ask a related question:

Let's say we have two tables: the first one shows absolute numbers and uses $(SalesFigure) as its expression, the second one shows per-customer values: $(SalesFigure)/Count(distinct customer_id). It is a requirement to keep them in separate tables, so it is impossible to use the Column() function.

The question is whether QlikView would be able to use the cache for the second table.

0 Likes
19 Views
Not applicable

Johan Idh, Henric Cronström

Thanks to both.

0 Likes
19 Views
msteedle
Contributor

I would be surprised to find out how the same cache entry can be used across documents, considering that QlikView doesn't even seem to reuse the same expression grouped by the same dimension in different chart types within one application. Ex. a straight table with Sum(Sales) by Customer and a bar chart with Sum(Sales) by Customer are calculated independently within an application. At least, this appeared to be the case when I tested this in an earlier version of 11.0.

0 Likes
19 Views
Employee
Employee

It should use the cache across documents and my experience is that it does.

If you have a case where the same combination of dimension and expression doesn't use the cache, there is probably a reason, e.g. that the expression (or a calculated dimension) is written in different ways in the two places. Either that, or you have found a bug. Because it should re-use the cache entry...

HIC

19 Views
Employee
Employee

Bill Markham

Create a macro, and run this from a button:

Sub ClearCache
    ActiveDocument.ClearCache
End Sub

HIC

19 Views
Employee
Employee

It was new to me that QlikView uses the cache across documents. What I cannot imagine is how the engine can know whether a stored aggregate from one document is valid for the other document and can therefore be used. It could be that common field names are pure coincidence, without relevance.

0 Likes
19 Views
Employee
Employee

That's the patent!

The digital fingerprint contains information both on field names and the data set behind. So it does not just use the field name. A different data set will inevitably lead to a different fingerprint, whereas two apps with identical data sets will get the same fingerprint.

HIC

19 Views
barryharmsen
Contributor II

And this digital fingerprint is a hash, right?

19 Views
Employee
Employee

Yes, a 256-bit hash.

HIC

0 Likes
19 Views
Employee
Employee

Can an aggregate cache also be re-used "in chunks"? Assume this: a user selects countries AT, DE, CH and a chart renders Sum(Sales) over the dimension Country. Then the user deselects CH, which leaves a subset of what was previously calculated and cached. What will happen?

Can an aggregate cache be reused if the dimensionality is different (less granular)? I mean, if one chart has two dimensions, Year and Country, and calculates Sum(Sales), and another chart uses only Year and Sum(Sales), can the calculation of the second build its result out of the cached first aggregate and save time?

Thank you

19 Views