ByteSizes from Values in QlikView


I think you know that QlikView has a very effective way to handle and store data. The general approach is described here: Symbol Tables and Bit-Stuffed Pointers and here: Symbol Tables and Bit-Stuffed Pointers - a deeper look behind the scenes. Beside the special logic of how the data are associated with each other, this storage is the main reason why QlikView is so fast at calculating even huge amounts of data.

But even with this knowledge it's not easy to comprehend how much RAM an application consumes, which parts consume a considerable amount of RAM and which parts might need an optimization. Even more complicated is the attempt to predict which changes within the data model would have which effect on the RAM consumption.

Calculating the RAM consumption of the pointers is relatively easy and correlates directly with the number of distinct values within a field: The Importance Of Being Distinct.
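The relation between distinct values and pointer size can be sketched in a few lines. This is a simplified model of my own, assuming QlikView packs the pointers with just enough bits to address every distinct symbol:

```python
import math

def pointer_bytes_per_field(row_count: int, distinct_values: int) -> float:
    """Estimate the RAM needed by the bit-stuffed pointers of one field.

    Simplified model: each row stores an index into the symbol table,
    packed with ceil(log2(distinct)) bits per row.
    """
    if distinct_values <= 1:
        return 0.0  # a single value needs no pointer bits at all
    bits_per_row = math.ceil(math.log2(distinct_values))
    return row_count * bits_per_row / 8

# 10 million rows: 100 distinct values need 7 bits per row,
# 1 million distinct values need 20 bits per row.
print(pointer_bytes_per_field(10_000_000, 100))        # 8750000.0 bytes
print(pointer_bytes_per_field(10_000_000, 1_000_000))  # 25000000.0 bytes
```

The sketch shows why reducing the distinct values of a field (for example by splitting a timestamp into date and time) shrinks the pointer storage of every single row.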

Calculating the RAM consumption of the values in the symbol tables is much more difficult, because it depends on how the data type of a value is interpreted. Compared with the data handling of many other programs and databases there are no data types within QlikView: Data Types in QlikView. But from a RAM point of view this is too much of a simplification.

For example, a date created with date(DateNumField) will need 8 bytes, but only if there are any gaps between the dates within the field - if the field values are continuous without a gap, the RAM consumption is 0 bytes for this field. This means the field values won't be stored directly; instead they will be stored with further meta-data in the symbol tables (I could imagine that the $$SysTables contain this information).

I assume that for this a start and an end value, or an appropriate offset to the pointer (which then works like an autonumber field), is stored together with the format information, and that the values are later calculated on the fly.
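The idea can be illustrated with a tiny sketch (my own illustration of the assumed mechanism, not Qlik's actual implementation): if the values form a gap-free sequence, storing a start value plus the format is enough, and each value can be derived from the pointer on the fly.

```python
def compress_consecutive(values):
    """Return (start, count) if the values form a gap-free ascending
    sequence, else None - mimicking the idea that such a field needs
    no stored symbols, only an offset plus the format information."""
    start = values[0]
    for i, v in enumerate(values):
        if v != start + i:
            return None  # a gap forces storing every value (8 bytes each)
    return (start, len(values))

dates = [42601, 42602, 42603, 42604]         # serial date numbers, no gap
print(compress_consecutive(dates))           # (42601, 4) -> 0 bytes per value
print(compress_consecutive([42601, 42603]))  # None -> values must be stored
```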

But there are further complications. If the date field is automatically detected by the script through the default variables at the script start, it will be handled like a dual field and, as dual(date(Date, 'DD.MM.YYYY'), DateNumField), consume 20 bytes per value. It gets even more complicated if the field contains several different formattings and/or is mixed with some string values (one wrong value caused by poor data quality might be enough to change the interpretation).

I think this makes clear how complicated it is to detect how the data are interpreted and stored, and even more how this could be controlled and where the best place is to transform the data. With small datasets you don't need to worry much, but already with mid-sized datasets it might be noticeable - depending on the kind of data model.

With an (ideally) star scheme with relatively few dimension values compared to the fact table, most of the formatted values will be within the dimension tables, and it might not make much difference to the whole data model. Here you will find a similar example: Is it usefull to format data in a Load script for a QVD?
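A rough back-of-the-envelope calculation (with hypothetical numbers) shows why formatted values in a small dimension table barely matter: the per-value cost is paid once per distinct symbol, not once per fact row.

```python
# Hypothetical numbers: the formatting cost lives in the symbol table,
# so it scales with the number of DISTINCT values, not with the fact rows.
distinct_dates = 3_650        # ten years of dates in a dimension table
bytes_dual_date = 20          # formatted dual date, as measured above
bytes_plain_num = 8           # the same date stored as a plain number

extra_ram = distinct_dates * (bytes_dual_date - bytes_plain_num)
print(f"extra RAM for keeping the dates formatted: {extra_ram} bytes")
# 43800 bytes - negligible next to the pointer storage of a multi-million-row fact table
```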

But if there are any issues with the performance of an application (and you can exclude most of the common ones like unneeded fields, large link tables …), it might be worth the effort to optimize the application by the needed byte size of its values.

Looking at this (old) posting: Memory sizes for data types, I decided to develop a small tool to make it handier to detect how much RAM a field needs - in total and on average per value.

The application is quite simple and contains a small generic load with some rand- and mod-functions to get a few different results within the field values, plus a choosable number of records. The fields themselves are a variety of different period, number and string values to demonstrate that the same or similar values can be handled quite differently.

You can adjust these fields to be equal or close to your real data, or replace this table with one of yours. To make it more practical, a small macro routine is included which reloads the application without the mem-file, stores the mem-file and then reloads again with the mem-file (you might want to adjust this logic or do it manually if you are dealing with larger tables). To see what is meant by mem-file, take a look here: Recipe for a Memory Statistics analysis and Obtaining Memory Statistics for a QlikView Document.

Here is the used script:

/******************************** [ creating various data types ] *********************************

- creating a variety of random period-, numeric- and string-values per autogenerate-loop
- different fieldvalues will be created through the use of rand(), mod() and recno()

**************************************************************************************************/

Data:
Load
recno() as RecNo,
autonumber(date(today() + recno())) as Autonumber,
date(today() + recno(), 'DD.MM.YYYY') as DateContinuous,
date(today() + mod(recno(), 2), 'DD.MM.YYYY') as DateWithGaps,
// as DateAutomaticInterpreted, // the expression for this field was lost in the posting
month(today() + recno()) as Month,
monthname(today() + recno()) as MonthName,
year(today() + recno()) as YearSingleValue,
year(today()) + mod(recno(),2) as YearMultipleValue,
time(frac(now()) + recno() / rand() * 86400) as Time,
timestamp(now() + recno()) as TimestampSingle,
dual(timestamp(now() + recno()), now()  + recno()) as TimestampDual,
timestamp(now() + mod(recno(),2) + rand(),
    'WWWW DD. MMMM YYYY hh:mm:ss')
as TimestampLongSingle,
dual(timestamp(now() + mod(recno(),2) + rand(), 'WWWW DD. MMMM YYYY hh:mm:ss'),
now() + mod(recno(),2) + rand()) as TimestampLongDual,
chr(recno() + 65) as LetterAscii,
chr(recno() + 130) as LetterAsciiExtended,
floor(recno()) as IntegerFloor,
ceil(rand() * 100) as Integer,
text(chr(recno() + 48)) as NumberAsString,
if(mod(recno(), 2) = 0, text(chr(recno() + 48)), chr(recno() + 48)) as NumberAndString,
dual(ceil(rand() * 100), ceil(rand() * 100)) as DualInteger,
ceil(rand() * 1000000000000) as BigInteger,
rand() * 100000 as SmallFloat,
ceil(rand() * 1000000000000) + rand() * 100000 as BigFloat
autogenerate $(i); // variable i could be controlled within the gui per inputbox

/******************************** [ loading the mem-statistics ] **********************************

- loading the previously stored mem-file and converting the Size column into a readable number

**************************************************************************************************/

if '$(vLoadMode)' = 'including Mem-File' then

MemStatistics:
Load // the original posting listed the remaining mem-file fields here
    num(num#(Size, '#.##0', '.', ','), '#.##0,00', ',', '.') as Size
FROM [ByteSizesFromValues.mem]
(txt, codepage is 1252, embedded labels, delimiter is '\t');
end if
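The num#()/num() pair in the script parses the Size column, which is written with '.' as decimal and ',' as thousand separator, and re-renders it with the separators swapped. The same round trip can be sketched in Python (my own re-implementation of the separator swap, not Qlik code):

```python
def swap_separators(text: str) -> str:
    """Parse a number written as 1,234,567.89 (thousand ',' / decimal '.')
    and re-format it as 1.234.567,89 - what the num#()/num() pair does."""
    value = float(text.replace(",", ""))  # num#: interpret the raw text
    formatted = f"{value:,.2f}"           # e.g. 1,234,567.89
    # num: swap the separators for the German-style output format
    return formatted.replace(",", "_").replace(".", ",").replace("_", ".")

print(swap_separators("1,234,567.89"))  # -> 1.234.567,89
print(swap_separators("7.00"))          # -> 7,00
```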


Here is the used macro-routine:

option explicit

sub StoreMemFile

dim doc, path
set doc = ActiveDocument

path = doc.evaluate("=subfield(DocumentPath(), '.', 1)") & ".mem"

doc.Variables("vLoadMode").SetContent "excluding Mem-File", true 'set the load-variable to exclude
doc.Reload 'reload the application without the mem-file
doc.SaveMemoryStatistics path 'store the mem-file
doc.Variables("vLoadMode").SetContent "including Mem-File", true ' set the load-variable to include
doc.Reload 'reload again and this time with mem-file

end sub

And this is the result of the application, with a short overview of the common data types and their formattings:





| Data type | Byte size | Example / Comment |
|---|---|---|
| Number as String | 11 Bytes + 1 Byte per additional char | '7' = 11 Byte and '2222' = 14 Byte |
| AlphaNumeric | 7 Bytes + 1 Byte per additional char | 'a' = 7 Byte and 'abc' = 9 Byte |
| AlphaNumeric Extended | 8 Bytes + 1 Byte per additional char | 'ò' = 8 Byte and 'òõ' = 9 Byte; 1 Byte more for chars with an index above 127, for example chr(130) |
| Number | 8 Bytes | integer/float, positive/negative or size don't matter, but only if the whole field is numeric |
| Mixed Number + String | combination of the used values | 'a' = 7 Byte + number = 8 Byte = Ø 7.5 Byte |
| Autonumber | 0 Bytes | fieldvalue = pointer |
| Consecutive numbers | 0 Bytes | recno() returns a consecutive number; handled like an autonumber, probably with an offset to the fieldvalue which is then used as pointer |
| Date (consecutive, without dual) | 0 Bytes | |
| Date (non-consecutive, without dual) | 8 Bytes | handled as numeric with separate storage of the format information |
| Date (automatically detected) | 20 Bytes | |
| Date (dual + formatted) | 20 Bytes | |
| Month (dual) | 13 Bytes | |
| MonthName (dual) | 18 Bytes | |
| Year | 8 Bytes | |
| Time (not dual) | 8 Bytes | |
| Time (dual) | 22 Bytes | |
| Timestamp (dual) | 33 Bytes | 'DD.MM.YYYY hh:mm:ss' |
| Timestamp (dual) | 47 Bytes | 'WWWW DD. MMMM YYYY hh:mm:ss' |

The number of chars which the formatting returns determines the byte size.
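The measured rules for the plain string and number cases can be condensed into a small estimator (a sketch of my own covering only the non-dual cases; the dual variants additionally depend on the rendered format length):

```python
def estimate_symbol_bytes(value) -> int:
    """Estimate the symbol-table bytes of a single value, following the
    measured rules: pure numbers 8 bytes; strings pay a base cost for the
    first char plus 1 byte per additional char (base 11 for numeric-looking
    strings, 8 for strings with extended chars, 7 otherwise)."""
    if isinstance(value, (int, float)):
        return 8                          # any pure number, integer or float
    if all(c.isdigit() for c in value):
        base = 11                         # '7' = 11, '2222' = 14
    elif any(ord(c) > 127 for c in value):
        base = 8                          # 'ò' = 8, 'òõ' = 9 (simplified: whole string treated as extended)
    else:
        base = 7                          # 'a' = 7, 'abc' = 9
    return base + len(value) - 1          # first char is included in the base

for v in ["7", "2222", "a", "abc", "ò", "òõ", 42, 3.14]:
    print(repr(v), estimate_symbol_bytes(v))
```

Summing this estimate over the distinct values of a field gives a rough lower bound for its symbol-table footprint, which you can compare against the mem-file numbers of the tool.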

I hope playing with this tool will help you understand how QlikView handles the data in the symbol tables and that you can adapt some points to improve the performance of your applications. Don't hesitate to give feedback and ideas to improve the tool.

Happy Qliking!

Marcus Sommer


Thanks Marcus,

very informative; this topic contributes to a deep understanding of how QV stores the data.


I have performed an analysis at byte level and fine-tuned it, which gave awesome results. It improved the response time and reduced the memory on disk and in RAM. The leadership team is quite happy to see these results within such a short time.

Thanks, Marcus, for documenting this.

-Ramesh A

Version history
Last update:
‎2016-08-22 01:46 AM