Not applicable

Document Performance Issue - Very slow to calculate

This is a broadened discussion from my previous post, where I posed the question in a more symptom-focused manner.

I have a document whose size on disk is 6.8 GB, and performance is very slow, making it practically useless for my end users. There are roughly 650 million rows of data in total in the document.

The performance issues include opening time (approx. 3 minutes on initial open) as well as calculation time after filters are applied (anywhere from 15-20 seconds to over a minute for at least one of the objects to calculate). It's just as slow when using the developer client as it is on Access Point, and there's no significant difference in performance whether I am working on a dev server or in production.

So far I have tried the following:

Used Document Analyzer to review data sizes and usage, and made these changes:

     * removed all unused fields (at least those of any significant size)

     * removed the time portion of all date fields and converted the dates to numeric format

     * replaced large key fields with AutoNumber() (see the script sketch after this list)
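
For illustration, the script side of those changes looks roughly like this (table, field and key names are placeholders, not taken from the actual document):

    // Hypothetical sketch of the date and key optimizations listed above.
    // Floor() drops the time portion so far fewer distinct values are stored,
    // and AutoNumber() swaps a long composite key string for a compact integer.
    OrderHeader:
    LOAD
        AutoNumber(OrderID & '|' & CustomerID, 'OrderKey') AS %OrderKey,
        Floor(OrderTimestamp)                              AS OrderDateNum,
        OrderAmount
    FROM OrderHeader.qvd (qvd);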

Other changes made based on a review of optimization best-practice advice on this forum:

     * minimized all but most-used objects

     * replaced commonly used expressions with parameters (there aren't very many of these though)

     * added a 'conditional show', controlled by a button, to some columns so that not all columns are trying to calculate all the time

Another suggestion I have received but have yet to try is to prevent the objects from auto-calculating until the user has made all desired selections and clicks a button to 'Apply Filters'. This will keep the objects from recalculating after each individual selection when the user wants to make several, but it will still take quite some time once the chosen filters are applied.
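
A rough sketch of how that could be wired up (vApplyFilters is an invented variable name, not something from the actual document):

     * define a variable vApplyFilters with a default value of 0
     * give each heavy chart a calculation condition of vApplyFilters = 1, with a custom 'calculation condition unfulfilled' message such as 'Make your selections and click Apply Filters'
     * give the 'Apply Filters' button a Set Variable action that sets vApplyFilters to 1
     * optionally, an OnAnySelect document trigger can set it back to 0 so the charts pause again while the next round of selections is made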

P.S. - I have other documents with similar but slightly less paralyzing performance issues, ranging in size from 200 MB to just over 1 GB - this leads me to believe it's not strictly document size causing the slow calculations.

7 Replies
robert_mika
Master III

Not applicable
Author

Thanks for the reply, but as I noted above, I have already made use of Document Analyzer to clean up the document as much as possible.

JonnyPoole
Employee

There are 3 general buckets of performance tuning you can look at:

1. Application best practices

- you cited a number of these. There are too many to list here, but the key is reducing the variety of data values and simplifying overly complex UI expressions.

2. Hardware best practices

- hardware selection, BIOS configuration and virtualization settings can all introduce bad bottlenecks.

- nothing you do in #1 will make up for bad hardware selection or configuration; this is exacerbated when there is a lot of data per app.

- if you can, post the full hardware specs you are using on the server: chipsets, bus speeds, clock speeds, etc.


3. Segmentation

- leverage loop-and-reduce & document chaining to turn a large monolithic app into an array of well-performing apps with drillable context from one to another
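
As a hypothetical shape for this: a Publisher loop-and-reduce task could split the monolithic document into one reduced copy per region or product group, and a lighter summary app could link into the matching detail app through a button with an 'Open QlikView Document' action that transfers the current selections, so the user drills from aggregate to detail without any single app having to hold all of the data.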

My recommendation would be a review with Qlik Services or a strong implementation partner to fine-tune the bottlenecks in this and other applications with large data sets.

Not applicable
Author

Hey Ashley,

did you also check the following points?

  • avoid string comparisons / If() in calculation conditions, e.g. If(GetFieldSelections(FieldA) like '*blubb*', 1, 0)
  • make sure variable definitions do not include a leading '=' sign --> that forces the variable to recalculate on every selection; use $(variable) in the expression instead (see the example after this list)
  • avoid If() statements in list boxes / nested If() in general
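
To illustrate the second point (vTotalSales and Sales are placeholder names):

    variable defined as:   =Sum(Sales)      --> evaluated on every selection, document-wide, even when unused
    variable defined as:   Sum(Sales)       --> plain text, only evaluated where it is expanded
    chart expression:      =$(vTotalSales)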

What is your data model approach / what does it look like?

Not applicable
Author

I have no string comparisons in the doc, and I have replaced IF statements with set analysis where possible, leaving very few IF statements (none at all on the main page where I'm currently focusing my attention).

At your suggestion I made sure no '=' signs were included in variable definitions, but none of the variables in use on my main page had the '=' anyway.

Data Model: Aside from a few small dimension tables, I have a fact table for Order Header and another for Order Detail, with 281 and 291 million records respectively, a Person table with 61 million records, and an Order History table (55 million records) which aggregates order header data at the Person ID level. The last large table in the data model is a bit difficult to explain, but it was created for a specific chart (not appearing on the main tab) that allows the user to measure ordering habits before and after the selected product was first ordered. So it concatenates products from Order Detail by order ID. That table could be weighing down the document on its own, but the document was already performing poorly before that table was added to the model, so I know it's not the sole issue. I can try moving it to its own doc and see what that does.

I have toyed with ideas of aggregating the data, but given the level of dimensionality required by the users, it really can't be aggregated.

marcus_sommer

So you have several large fact tables and have already put some effort into optimizing the app's performance. I would try to rebuild the data model more in the direction of a star schema and merge (concatenate + join + map) the fact tables, and maybe also one or another dimension table, into one large fact table. Your app size and the reload/load times will probably increase, but the GUI performance with one huge fact table is often better than with several large fact tables that are directly associated or connected via a link table.
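
A minimal sketch of that kind of merge, with invented table and field names:

    // Concatenate the fact tables into a single fact table;
    // shared fields keep their names, a flag distinguishes the record types.
    Facts:
    LOAD %OrderKey, PersonID, OrderDateNum, OrderAmount, 'Header' AS RecordType
    FROM OrderHeader.qvd (qvd);

    Concatenate (Facts)
    LOAD %OrderKey, ProductID, Quantity, LineAmount, 'Detail' AS RecordType
    FROM OrderDetail.qvd (qvd);

    // Small (or selected) dimension tables can be joined or mapped in
    // instead of staying as separately associated tables:
    Left Join (Facts)
    LOAD PersonID, Region, AgeBand
    FROM Person.qvd (qvd);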

Another point could be to transfer calculation steps into the script. For example, rather than using period-related set analysis filters like Sum({< Year = {"$(=Max(Year) - 1)"} >} AnyValue) /* for the previous year */, use flag fields instead, which enable expressions like Sum(AnyValue * PreviousYearFlag). This again will increase your app size but will be faster in the GUI calculation. The aim is also to avoid filters within the expressions and other categorizing matches like calculated dimensions or similar.
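
A sketch of the flag-field idea in script terms (field and variable names are illustrative):

    // Determine the latest year once at reload time...
    TmpMaxYear:
    LOAD Max(Year) AS MaxYear RESIDENT Facts;
    LET vMaxYear = Peek('MaxYear', 0, 'TmpMaxYear');
    DROP TABLE TmpMaxYear;

    // ...then tag each year instead of filtering at chart time.
    Left Join (Facts)
    LOAD DISTINCT
        Year,
        If(Year = $(vMaxYear) - 1, 1, 0) AS PreviousYearFlag
    RESIDENT Facts;

    // Chart expression:  Sum(AnyValue * PreviousYearFlag)
    // instead of:        Sum({< Year = {"$(=Max(Year) - 1)"} >} AnyValue)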

I suggest taking a step-by-step approach here rather than implementing everything at once, and consulting external help, as Jonathan suggested, might save valuable time.

- Marcus

Anonymous
Not applicable
Author

Hi Ashley,

Here is a good whitepaper that discusses, among other things, Qlik application architecture and different ways applications can be structured to maximize performance. It is well worth the read for any Qlik developer/app architect dealing with larger data sizes and how to manage them in an efficient way.

http://global.qlik.com/uk/explore/resources/whitepapers/qlikview-scalability-overview

Cheers,

Johannes