Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
obriened83
Creator
Creator

Optimize Data Model?

Hi,

   I was wondering if someone could help me with an ongoing issue that we am experiencing with our QlikView application?

We currently have an application that has over 1 billion rows of data. The problem is that this data is growing exponentially every Quarter. For example, 2014 Quarter 2 of transactions was the same amount of transactions that were made for all of 2010.

We have been noticing performance issues as our application has become more and more unstable as quarters past by.

As the business has being used to a certain window of available data, I don't believe reducing the data is an option.

I believe the key is to solving this issue is to change our data model but I've have tried to optimize it as much as possible.

(Removed Unwanted Columns, Build Fact Tables, etc). Could anyone tell me if there are ways to optimize this data model more so that the application performance can remain stable for end user use?

Attached is the Application's data model overview. Any help on this would be greatly appreciated.

Thanks again.

Cheers

Eamonn

12 Replies
m_woolf
Master II
Master II

You might want to consider document chaining.

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

I would start by running DocumentAnalyzer

Qlikview Cookbook: Document Analyzer V2 http://qlikviewcookbook.com/recipes/download-info/document-analyzer/

to see if the RAM requirements could be reduced for some fields. For example,

1. Can you autonumber the keys?

2. Are Dates and Timestamps being stored in the most efficient 8 byte format?

You may want to use a reduced document for the DocAnalyzer run. The relative results should be correct even with less data.

What are the data islands for? Use of data islands can be a performance problem, esp in such a large app.

Make sure that all expressions use data from a single table and row where possible.

When you say "unstable", are there error messages in the server event log?

-Rob

PradeepReddy
Specialist II
Specialist II

Some options...

1) On opening the document, restrict the data to latest year only.

    IF the the user wants more then they will clear the selections and can see all the required data.

2) Identify the columns which can be split. (Ex:Time-Stamp)

obriened83
Creator
Creator
Author

Hi Rob,

   Thanks for getting back to me on this. To answer your questions:

1. Can you autonumber the keys?

- Yes I have turned the Cust Key to an Autonumber using the Autonumber(Cust_Key,'BLI') function.

This does improve the application run-time performance but not to a great affect. The Reload time does suffer as the load is no longer optimized.

2. Are Dates and Timestamps being stored in the most efficient 8 byte format?

- Yes I have all dates stored as 'YYYY-MM-DD' in a 8 byte format

3. What are the data islands for?

- The application is build so that user is able to first select his\her metrics and dimensions, clicks an apply button which then executes a macro that builds a Pivot table with the user's dimensions and metrics.

The execution time of the macro is pretty fast.

The data islands contains the list of all the metrics and dimensions that are available to the user.

Could a small data island table (50-100 rows) have an impact on the Application?

4. When you say "unstable", are there error messages in the server event log?

We suspect that this application could be the cause of our QlikView Server Services crashing and restarting every so often. The symptoms caused are CPU's are maxed out for about 10 minutes and then all QlikView services are restarted.

The Server Event Log gives the following error:

2014-08-10 23:46:382014-08-11 17:14:261300ErrorRestart: Server aborted trying to recover by restart. Reason for restart: Internal inconsistency, type D, detected.
2014-08-10 23:46:382014-08-11 17:14:261300ErrorRestart: Server aborted trying to recover by restart. Reason for restart: Internal inconsistency, type A, detected.

It's seems to happen randomly as recreating the symptoms proves to be difficult.  

Thanks again for the feedback.

Cheers

Eamonn

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

Sorry for the delayed response, I've been of on vacation.

- The data islands per se do not cause problem, it's how they are used -- usually by expressions -- that can cause performance problems. In your case, it doesn't sound like the islands are causing performance problems, but the macro may be causing stability problems. I imagine you are using a "SetProperties" API call in the macro. This may "work", but it's never been a supported or recommended (my opinion) idea. From section 80.2 of the Ref Guide.

"The following types of functionality are not to be used in the QlikView Server environment , as they may

cause unexpected results:

l Layout operations acting on the properties of sheets and sheet objects via SetProperties

,,,"

That could be contributing to or causing your inconsistency errors. If you are running QV11, replace the macro with conditional Dimensions and Expressions which is a supported and effective technique.

- Did you scan the DocAnalyzer field sizes to see if there are any suspiciously large fields?

-Rob

obriened83
Creator
Creator
Author

Hi Rob,

     That's no problem, thanks again for getting back to me.We did use the Document Analyzer and there were fields which were identify that we optimized.


Our issue is that there is a lot of granularity in our data. For example, 1 of our main fields is a bet selection string data field where we need to display what a user selected. It is very difficult to optimize and compress this field as there is no real pattern\regularity of values. Would we need to break it into multiple columns for QlikView's compression to work?


The disk file size to this application is over 14GBs and over 100 GBs in Memory. I was wondering if you could tell me what in your opinion would be the best way to maintain scalability and performance for an application of this size?


Thanks again.

Eamonn


   

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

I would make replacing the macro a priority.

1. How much RAM does the server have?

2. Do you see any "WorkingSet: Virtual Memory is growing beyond parameters .." messages in your event log?

3. What version and SR is the server?

This may be a good candidate for document segmentation, especially as you have indicated the volume will continue to grow exponentially (congratulations on being part of such a successful business!).

The idea behind document segmentation is that you produce a qvw that has summary data for all periods and detailed data for only a recent term. For example, a sales app may have Sales numbers summarized by Region, Customer, Category and Month for 10 years, but line item detail for only the last 12 months. If the user needs to see line item detail for a prior year, they press a button that uses the "Open Qlikview Document" Action to open another qvw (document chaining) that has detail for that year.

The segmentation idea relies on being able to roll up to the smallest level of granularity required for long term analysis. I've found it will work in most cases, but you have to sit down and define the analysis scenarios with your users.

Another interesting possibility would be to use Direct Discovery. Again, that would require taking a look at the use cases.

If you have enough RAM, I believe you should be able to get the current app stable. But you will have to address the long term growth.

If you want to chat about this, contact me via the contact form at robwunderlich.com. It sounds like an interesting app.

-Rob

http://masterssummit.com

http://robwunderlich.com

obriened83
Creator
Creator
Author

Hi Rob,

    Sorry for the delay in getting back to you.

           1. How much RAM does the server have?

           The Server has 40 cores and 512 GBs of Ram


          2. Do you see any "WorkingSet: Virtual Memory is growing beyond parameters .." messages in your event                log?

             No. It's wouldn't be a virtual memory issue. In the event when it does happen all we get error messages

             such as Server aborted trying to recover by restart. Reason for restart: Internal inconsistency, type D,                                  detected.


         3. What version and SR is the server?

              We have QlikView 11.2 SR6

             The segmentation approach is a great idea, The one issue I would have is with Document Chaining as there would be an extra cost for licenses for every user who uses this application and accesses the chained document?

Thanks again for the feedback.

Cheers

Eamonn

fred_s
Partner - Creator III
Partner - Creator III

Hi Eamonn,

One simple thing to check to reduce filesize:

Settings > Doc properties > Tables

Below look for fields with lots of Unique values (#Distinct).

Do you really need all of these fields?

Do you need all of the Unique values?

Regards,

Fred