Skip to main content
Announcements
Introducing Qlik Answers: A plug-and-play, Generative AI powered RAG solution. READ ALL ABOUT IT!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Is there a faster way to calculate a dimension?

Hi,

We use Qlikview for a dataset of 1.5 billion rows.  Due to its size, normal methods of calcuating fields can take too long and I have to look for another method.

I am really struggling with the speed of one calculation that uses the aggr function.  We want to roll up item level data to calculate the number of people who have bought 1,2,3 etc of a particular item in one transaction (a basket).  The outcome looks as follows...

Target Units          Basket Count

     1                         8,000,000

     2                         6,000,000

     3                         3,000,000

     4                         1,000,000

The calculated dimension and expression formula I have used for this is below.  Does anyone know another way we could get this result using a different method that might run faster?  The approach of using a dimension with the aggr function is taking 5 minutes to calculate!

Dimension =aggr(distinct sum(SALES_UNITS), BasketNumber)

Expression = count(distinct BasketNumber)

Many thanks for any help.

10 Replies
johnw
Champion III
Champion III

Jason Michaelides wrote:

...Have you considered breaking up the qvw based on a dimension (gov_office_region, for example)? Or restricting the date range per application?  You can use document chaining to move to archived data if necessary - is it really important to have all data available in the same application?  I think you may need to start thinking along these lines...

As a simple example of the above, our largest application has five years of manufacturing data and can be quite slow.  However, perhaps 95% of the time, the users are only interested in the current year.  So we use Publisher to build a YTD version with only the current year in it, and THAT version seems to have acceptable performance.

In a related case, I was building a chart summarizing all the data in that application, and despite trying multiple approaches and spending days on tuning, it still took over a minute to render.  I ended up building a separate application with a data model designed explicitly to support that one chart, and it renders just fine.  It's not ideal to split things up like this, but sometimes it may be the only practical option.