Not applicable

Poor Performance: 16GB RAM, 200,000 KB QVW file, 30M rows, 3 columns

I have a blank QVW canvas loaded with only one table's data:

  • 30M rows
  • 3 columns

[Image: qlikview_trouble_simple_model.PNG]

Here are the columns:

[Image: qlikview_trouble_table_tab.PNG]

Here are the memory statistics:

      

Class        Type         SubType    Id                Bytes        Count       Size        CalcTime  AvgCalcTime
Variable     Variable     Internal   CD
Variable     Variable     Internal   QvPath
Variable     Variable     Internal   QvRoot
Variable     Variable     Internal   QvWorkPath
Variable     Variable     Internal   QvWorkRoot
Variable     Variable     Internal   WinPath
Variable     Variable     Internal   WinRoot
Variable     Variable     Internal   ErrorMode
Variable     Variable     Internal   StripComments
Variable     Variable     Internal   OpenUrlTimeout
Variable     Variable     Internal   ScriptErrorCount
Variable     Variable     Internal   ScriptErrorList
Variable     Variable     Internal   ScriptError
Database     Table        Records    Data              179,206,446  29,867,741  6
Database     Table        Records    $$SysTable 1      3            3           1
Database     Table        Records    $$SysTable 2      1            1           1
Database     Table        Records    $$SysTable 3      3            3           1
Database     Field        Symbols    $Field            34           3           11.33
Database     Field        Symbols    $Table            10           1           10
Database     Field        Symbols    $Rows             18           1           18
Database     Field        Symbols    $Fields           11           1           11
Database     Field        Symbols    $FieldNo          33           3           11
Database     Field        Symbols    $Info             6            1           6
Database     Field        Symbols    amount            254,824      31,853      8
Database     Field        Symbols    owner             426,355,655  20,214,495  21.09
Database     Field        Symbols    State             72           9           8
State Space  Table State  Internal   Data              29,867,741
State Space  Table State  Internal   $$SysTable 1      3
State Space  Table State  Internal   $$SysTable 2      1
State Space  Table State  Internal   $$SysTable 3      3
State Space  Field State  State      $Field            3            1           3
State Space  Field State  Selection  $Field            0
State Space  Field State  State      $Table            1            1           1
State Space  Field State  Selection  $Table            0
State Space  Field State  State      $Rows             1            1           1
State Space  Field State  Selection  $Rows             0
State Space  Field State  State      $Fields           1            1           1
State Space  Field State  Selection  $Fields           0
State Space  Field State  State      $FieldNo          3            1           3
State Space  Field State  Selection  $FieldNo          0
State Space  Field State  State      $Info             1            1           1
State Space  Field State  Selection  $Info             0
State Space  Field State  State      amount            31,853       1           31,853
State Space  Field State  Selection  amount            0
State Space  Field State  State      owner             20,214,495   1           20,214,495
State Space  Field State  Selection  owner             0
State Space  Field State  State      State             9            1           9
State Space  Field State  Selection  State             0

My machine has 16GB of RAM.

The file is ~200,000 KB (~200 MB).

The data is loaded by a simple load script, which takes a reasonable amount of time.

So I have the data and a blank QlikView canvas... lovely:

I add a Chart:

Dimension : owner

Expression : SUM(Amount)

Dimension Limit : Top 10, or maybe Top 20... a very small subset

When I press "OK" on the CHart it may render, or it may (crashes due to Memory Allocation Exceeded, or "failed to draw chart")

If it renders, it takes a lengthy delay, between 2 and 8 minutes.

Once it renders, re-sorting descending vs ascending will cause another lengthy delay. Or it crashes.

I assume any other QlikView activities (filtering/making a selection, exporting to Excel, etc.) would be similarly unbearably slow.

While it tries to render, Task Manager shows memory usage varying between 11.5 GB and the 16 GB maximum (with QlikView closed, it hovers around 8 GB).

CPU usage varies between 50% and 100% (with QlikView closed, it hovers around 40%).

  • The table model is super-simple, so I don't think I can make improvements there.
  • The canvas would have only a single straight table chart on it (with dimension limits!), so I don't think I have too many sheet objects. (Note: the memory statistics above probably don't include any chart objects, because when I try to add a chart it is either too slow or it crashes.)
  • Can any of the options on the Document Properties > Tables tab help me?
  • How can I understand what's taking so long?
  • General question: why is CPU usage spiking when there is still physical memory available? Can the engine use memory instead of CPU?
  • This data came from a SQL database (a view). Can I tell you anything useful about that view? Can I use QlikView "Direct Discovery" to push the processing down to the database instead of doing it on my computer? (See the sketch below.)
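
For reference, here is a minimal Direct Discovery sketch of what I have in mind (untested, and the connection and view name are placeholders):

    ODBC CONNECT TO MyDSN;        // hypothetical connection

    DIRECT QUERY
        DIMENSION owner
        MEASURE amount
        FROM dbo.MyView;          // hypothetical view name

As I understand it, aggregations such as Sum(amount) would then be translated to SQL and executed by the database, so only the aggregated result reaches QlikView.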

I can accept "well, it's just too much data for your simple/weak personal computer!" if that is the answer. I thought 16 GB of RAM was quite powerful, but maybe I'm wrong...

But I'm frustrated, because I've had a bigger table, like:

40M rows

10 columns

Which results in a bigger QVW file, like:

900,000 KB (almost 1GB)

And many more sheet objects on the canvas, including a chart and a search object.

... and it seems the bigger file was both faster and more stable than the smaller file ...

Any advice appreciated...


29 Replies
Siva_Sankar
Master II

Nathaniel,

Since you said the data model is simple, check whether you are loading all the fields from the source or only the required ones. You can also find unused fields with Document Analyzer and remove them; you can get it here: Document Analyzer V2 -- a major update

Second, Document Analyzer doesn't just list unused fields: it also shows which objects consume the most memory, which expressions take the most calc time, and much more. The details and instructions are in the application itself.

If your data model keys are concatenated fields, try the AutoNumber function when creating the key; it can considerably reduce the memory footprint.

If a field has many distinct values and you find you don't actually need them, work out a way to remove or modify it. For example, if you have a transaction date with a timestamp and you don't need the time portion, trimming it greatly reduces memory use and calc time.

Try to aggregate the data in the script itself, and move complex expressions from the chart into the script.

Avoid calculated dimensions; instead, create the calculated column in the script and bring it into the chart as a plain field. (A rough sketch of these script-side ideas follows.)
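
A rough, untested sketch of what I mean (all field and file names here are invented for illustration):

    // Pass 1: shrink the data as it is loaded
    Stage:
    LOAD
        AutoNumber(Region & '|' & CustomerID) AS %CustomerKey,  // concatenated key -> small integer
        Date(Floor(TransactionTimestamp)) AS TransactionDate,   // drop the time part: far fewer distinct values
        Amount
    FROM Facts.qvd (qvd);

    // Pass 2: pre-aggregate so the chart has less work to do
    Facts:
    LOAD
        %CustomerKey,
        TransactionDate,
        Sum(Amount) AS Amount
    RESIDENT Stage
    GROUP BY %CustomerKey, TransactionDate;

    DROP TABLE Stage;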

Siva

Clever_Anjos
Employee (Accepted Solution)

That's because your owner field has very high cardinality: 20,214,495 distinct values.

A straight table with "Top 10" must calculate Sum(Amount) for each of those 20,214,495 values, sort the results, and only then take the top 10.

QlikView/Sense performance is heavily affected by the number of distinct values.

Related reading:

Symbol Tables and Bit-Stuffed Pointers

http://qlikdevgroup.com/wp-content/uploads/Hamburg-Event01-Session01-QIX-Engine-Under-the-hood.pdf

Not applicable
Author

Thanks, Clever! I'm glad you could point me at the main problem, rather than suggest general optimizations.

I also suspected that the high cardinality of the owner field is the main (only?) problem.


I wanted to prove this by creating a fake dataset that has a string column with 29M total values and 20M distinct values --

I should expect similarly slow performance even from the fake dataset, right?

In fact, to generalize: anyone with comparable hardware should experience comparably slow performance with ~20M distinct values in any string field, regardless of dataset, right?
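
Something like this minimal AutoGenerate sketch should produce a comparable fake dataset (untested, and the names and ranges are just my guesses):

    FakeData:
    LOAD
        'owner_' & Mod(RecNo(), 20000000) AS owner,  // ~20M distinct strings across 30M rows
        Ceil(Rand() * 1000) AS amount                // arbitrary numeric measure
    AUTOGENERATE 30000000;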

----

(On a side note, let's assume each string value is, on average, about 20 bytes long. I think that's how to interpret the "Size" column in my memory statistics, right? "Bytes" divided by "Count" -- for the owner row, 426,355,655 / 20,214,495 ≈ 21.09:)

Database  Field  Symbols  owner  426,355,655  20,214,495  21.09

This assumption is important because an average length of 30 bytes or 100 bytes would probably behave differently -- it's more expensive to test uniqueness when the strings are longer. Well, I guess that depends on the string algorithm...

Clever_Anjos
Employee

"I should expect similar slow performance from even the fake dataset right?"

Yes, I´m almost sure about it. I´ve made myself a fake database similar to yours and my laptop i7 with 8Gb hung while calculating that top 10 straight table.


"In fact, to generalize:, anyone with comparable hardware should experience comparable slow performance with 20M cardinality on any string field they have, regardless of dataset, right"

Yes, similar hardware should perform similarly.


"This assumption is important because an average length of 30 bytes or 100 bytes probably behaves differently "

I agree with you.

Changed: every string (using 30 or 100 bytes) is stored only once into Symbol Table generating a pointer. That behaviour should be different but not so much.



vinieme12
Champion III

Alternatively, create a table with aggregated values (call it a summarized table) covering all your likely evaluations; that will take the load off your dashboard. (A rough sketch follows below.)
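
Something along these lines (a rough sketch; I'm assuming the loaded table is named Data, as the memory statistics suggest):

    AggByOwner:
    LOAD
        owner,
        Sum(amount) AS amount
    RESIDENT Data
    GROUP BY owner;

    DROP TABLE Data;  // optional: keep only the summarized table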

Vineeth Pujari
If a post helps to resolve your issue, please accept it as a Solution.
Clever_Anjos
Employee

In this case, pre-aggregating would not speed things up much. We would only reduce ~30M records to ~20M, and QlikView would still have to sort them to produce the Top 10.

We would also face another concern: a LOAD with a GROUP BY is really CPU-expensive (it usually runs single-core), taking a very long time to calculate.

vinieme12
Champion III

Yes; well, the reduction in the number of rows would ultimately depend on the dimension values, the period range, and the level of granularity desired...

I prefer putting the load on the data reload rather than making business users wait for chart objects to render.

Vineeth Pujari
If a post helps to resolve your issue, please accept it as a Solution.
Clever_Anjos
Employee

We were discussing the particular case where there are 20 million different values to be aggregated.

Not applicable
Author

Oh, that's right: the symbol table can reduce or even eliminate the effect of string lengths. Cool.

Specifically, if I have 1,000 unique values, it doesn't matter whether the average string has length 10 or length 1,000. 1,000 unique values means the symbol pointers (after "bit-stuffing") use the minimum number of bits needed to address 1,000 distinct values: ceil(log2(1000)) = 10 bits each.

So the symbol pointers are the same size regardless of string length, which means my data table (which contains only pointers) will be the same size.
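
As a rough sanity check against my memory statistics above (my own arithmetic, so take it with a grain of salt):

    owner:  20,214,495 distinct values -> ceil(log2(20,214,495)) = 25 bits per pointer
    amount:     31,853 distinct values -> ceil(log2(31,853))     = 15 bits
    State:           9 distinct values -> ceil(log2(9))          =  4 bits
                                                           total = 44 bits per row

44 bits, padded up for alignment, is 6 bytes per row, and 29,867,741 rows × 6 bytes = 179,206,446 bytes -- exactly the Bytes figure for the Data table.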

However, as Henric said, the symbol table itself will naturally be bigger if the strings are bigger:

"The length of the symbols will affect the size of the symbol table."