Not applicable

Set Analysis for Prior Periods Cumulative Sum with Non Contiguous Period Selections

Hi,

I am working with some data where I need to calculate and display cumulative balances over time.  To maintain maximum flexibility, I am hoping to achieve this via set analysis and chart expressions (as opposed to pre-calculating summary tables and/or running balances in the load script, as this brings with it other complexity and reduces overall flexibility) if at all possible.

I have put together a grossly simplified example, but I am hoping it is enough to demonstrate the problem.  The sample contains data for a year, with values for each month.  These are displayed in a pivot table where the periods are pivoted across the top.  There are two expressions on the pivot: one to display the current month value, and one for the cumulative value (which I have calculated with RangeSum).

All Periods.png

The next part of the puzzle is giving users the ability to filter specific periods off the chart whilst maintaining the appropriate running balances.  If a user selects specific months (e.g. Mar, May, Dec) they should see the chart filtered to those months.  The expression values from the chart above should essentially remain the same, but only the selected months should be displayed (see below for an example of what I am aiming for).

Filtered Prior Periods.png

I managed to get this working for a single contiguous range of dates using a combination of RangeSum and set analysis (see below), but now need to take it to the next level to support non-contiguous date range selections.

=RangeSum(Before(Sum(Value), 0, ColumnNo()))
+
Alt(
    Sum(DISTINCT
        Aggr(NODISTINCT
            Sum({1<PeriodNumeric={"< $(=Min(PeriodNumeric))"}>}
                TOTAL <Region, SubRegion> Value),
            Region, SubRegion)
    ),
0)

I know this is achievable by pre-calculating the data in the load script, but I am trying to avoid this to maintain maximum flexibility in the UI to support ad hoc requests.  Any ideas or suggestions on how I might approach this using set analysis and/or expressions?  Hopefully I'm missing something really simple...  Thanks in advance for your help.

Cheers,

Graeme


12 Replies
Gysbert_Wassenaar

Why not simply add a field CumVal to the table and use that field in your pivot table?

Add this line to the load script just above the INLINE keyword:

,If(RowNo() = 1 or Region & SubRegion <> Previous(Region & SubRegion), Value, Value + Peek('CumVal')) as CumVal

If your table isn't ordered by Region, SubRegion and PeriodNumeric you'll have to sort it first so the CumVal restarts correctly on changes of Region and SubRegion.
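Put together, a minimal sketch might look like the following (the table names RawData/Data are my own placeholders, not from your app; the resident reload guarantees the sort order that Peek() and Previous() depend on):

// Sketch only -- RawData/Data are assumed table names.
// Re-sort the raw data so Peek() always sees the previous row of the
// same Region/SubRegion when building the running total.
Data:
LOAD
    Region,
    SubRegion,
    PeriodNumeric,
    Value,
    If(RowNo() = 1 or Region & SubRegion <> Previous(Region & SubRegion),
       Value,
       Value + Peek('CumVal')) as CumVal
Resident RawData
Order By Region, SubRegion, PeriodNumeric;

DROP TABLE RawData;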


talk is cheap, supply exceeds demand
Not applicable
Author

Hi Gysbert,

Thanks for the suggestion. I am aware of how to solve this problem in the load script but, as I mentioned in the post, my requirement here is to do this using set analysis and/or expressions.

The example I provided above is grossly oversimplified.  In the real data there will be many underlying transactions relating to various levels within the data hierarchy.  If I go down the path of pre-calculating the opening balances within the load script, I need to build a summary table (as otherwise, which of the transactions does the opening balance get applied to?).  This will complicate the data model and overall design, which I am trying to avoid if possible.

I am interested and curious to see if I can find a way to do this using set analysis and/or expressions (rather than in the load script).  Set analysis works perfectly for the calculation when the periods selected are contiguous; however, I fear it may just be a limitation of the way set analysis works, and that it may not be possible to run the logic on a "per period" basis across the pivot, but I'm hoping a guru out there can prove me wrong!

Thanks for the suggestion though Gysbert!

Regards,

Graeme

Not applicable
Author

I am thinking that it might be possible with a mapping table of some sort -

E.g. cross-join the periods, then have a flag to indicate the prior periods, and use that in the set analysis expression and on the pivot.  I will have a play with it and see how complex it gets - the load script approach may end up being cleaner, but I'll give it a try.  I'll need to check the impact on performance too.
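Roughly, the cross-join I have in mind would look something like this in the load script (a sketch only - the table name Periods and the fields SelectedPeriod/IsCurrentPeriod are placeholder names I've made up for illustration):

// Step 1: pair every period with every period (cross join via Join with
// no common fields between the two loads).
TmpMap:
LOAD PeriodNumeric as SelectedPeriod
Resident Periods;

Join (TmpMap)
LOAD PeriodNumeric
Resident Periods;

// Step 2: keep only prior-or-current pairs, flagging the current period.
PeriodMap:
LOAD
    SelectedPeriod,
    PeriodNumeric,
    If(PeriodNumeric = SelectedPeriod, 1, 0) as IsCurrentPeriod
Resident TmpMap
Where PeriodNumeric <= SelectedPeriod;

DROP TABLE TmpMap;

The idea is that PeriodMap links to the fact data on PeriodNumeric, while the user selects (and the pivot dimensions) use SelectedPeriod, so each selected period "sees" all of its prior periods.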

Not applicable
Author

I managed to get this working with a period map table as described above.  I had to add an extra flag so I could filter for the current period, but it seems to be working as expected in my example document.  I will just need to scale it up and see how it performs at production volumes (millions of rows).

Using a period map table.png
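For anyone trying the same thing, the chart expressions end up taking roughly this shape (a sketch assuming the flag is called IsCurrentPeriod and the column dimension is the mapped selection period - the attached document may name things differently):

// With the mapped selection period as the pivot dimension, each column
// already links to all of its prior-or-current periods through the map, so:

// Cumulative balance for the column's period:
Sum(Value)

// Current-period value only:
Sum({<IsCurrentPeriod = {1}>} Value)

Because the running balance falls out of the data model rather than out of RangeSum over visible columns, it survives non-contiguous period selections.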

slwasson
Partner - Contributor III

Hi Graeme,

I'm faced with something similar, and also prefer not to do the calculations in the load script because I'm working with budget and actual Gross Sales, Net Sales, Rebates and Exchange Rates.

Can you upload the QVW where you used a period map table?

Thank you and kind regards,

Stacy Wasson

Not applicable
Author

Hi Stacy,

The basic sample I was playing with is attached.  The way this has been implemented will work for small data sets, but for scalability and performance I would strongly recommend decoupling the mapping table from the core data and joining to the mapping table in the data model (this is what I ended up doing on our real data, as I found performance degraded when heading into the millions of rows with the attached structure).

Regards,

Graeme

slwasson
Partner - Contributor III

Hi Graeme,

Thank you - I understand what you're doing now, and agree that with large data sets there might be performance issues.  I'm working with a very small set of data (less than 20,000 rows) and think this might be a feasible solution.  Again, many thanks!

Stacy

Not applicable
Author

Graeme!

I have another problem, comparing values date by date for dimensions which have random records for every day, for example:

Date        Dim1  Dim2  Val
01.01.2012  GB    F     100
01.01.2012  USA   M     120
02.01.2012  GB    M     80

But your idea has perfectly resolved my issue as well! It's brilliant!

tduarte
Partner - Creator II

Hi Graeme,

I need to work out something similar, but in my case I need the cumulative sum of the previous 12 months:

Capture.PNG.png

If I use your solution then I'll have to multiply my data up to 12 times, which I want to avoid.

Is there an alternative solution just using Expressions?