Skip to main content
Announcements
Live today at 11 AM ET. Get your questions about Qlik Connect answered, or just listen in. SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Inconsistent behaviour of rows when using sum(aggr(if(sum(total<dim> measure)>threshold,1,0),dim))

Hi everyone!

Long time lurker, first time poster.

I've got a completely inexplicable behaviour (as far as I can tell) in a straight table (chart).

I'm trying to assess how many employees have worked in excess of vThreshold hours at any legal employer for each legal employer.

I have a formula that appears to work, however only the 'top line' sum works. If I make selections on Legal Employer, the sum for that individual legal employer is correct. But when there are no selections (or a subset of legal employers selected), I seem to get only the unique(?) employees for each legal employer returned. The total at the top remains accurate for the subset in selection, but each row has a lesser number.

I've tried numerous variations of this, including the obvious Count, and this is probably a terrible version of this formula. There's some underlying flaw in the way I'm running the conditional on Hours by Employee Number and accumulating the result.

I want the count of employees who worked at each legal employer who worked more than vThreshold hours with any employer.

=Sum(  aggr (

IF ( SUM (  {$+<[Legal Employer]=>}  Total <[Employee Number]>    Hours_hrs)  >=  vThreshold , 1 , 0 )

,[Employee Number]))

Obviously I can't share the underlying app and data, but here's a sample of Fact Table and the table I'm trying to produce:

Employee NumberLegal employerDate Hours_hrs
4Company A1/1/199940
2Company A1/1/199910
3Company B1/1/199910
1Company A1/1/199910
3Company A1/1/1999

10

1Company B1/1/199910
2Company C1/1/199910
3Company B1/1/199910
1Company A1/1/199910
3Company B1/1/199910

Legal EmployerAbove threshold (25)
Company A3
Company B2
Company C0
(all companies)3

Thanks for any assistance!

1 Solution

Accepted Solutions
swuehl
MVP
MVP

Another option could be

=Count({<[Employee Number] = {"=Sum({1} Hours_hrs)>=25"} >} DISTINCT [Employee Number])

(using {1} set identifier in field modifier expression seach to ignore e.g.selections on the employer).

edit: and to make your original expression work, you could try

=Count(DISTINCT

  aggr (NODISTINCT

IF ( SUM (  {$+<[Legal employer]=>}  Total <[Employee Number]>    Hours_hrs)  >=  25 , [Employee Number] )

,[Employee Number])

)

But I would rather use Sunny's expression using two aggr() dimensions to solve the dimensional grain mismatch, see also

Pitfalls of the Aggr function

View solution in original post

6 Replies
Anonymous
Not applicable
Author

I ended up solving this by trial and error with the following:

=Sum( aggr(  Count( DISTINCT

aggr( NODISTINCT IF( SUM( Total <[Legal Employer],[Employee Number]> Hours_hrs)>=vThreshold,[Employee Number],Null())

,[Employee Number])) ,[Legal Employer]))

If anyone can provide some insight into how one might write a solution that still retains the correct "all selections" sum at the top (this one provides the sum of non-unique employees across all organisations, so I use the other formula in a message box for the unique total), it would be very much appreciated.

sunny_talwar

Is your threshold 25 in the sample output you shared? How are you getting those numbers? 3 for Company A? and 2 for Company B? I used this

=Sum(Aggr(If(Sum(Hours_hrs) >=  25, 1, 0), [Employee Number], [Legal employer]))

to get this

Capture.PNG

Anonymous
Not applicable
Author

Thanks Sunny! I'm afraid that doesn't do it - I'm not sure I communicated the intended behavior clearly enough.

In the example, you're right, 25 hours (total worked per employee across all orgs) was the threshold. As such, company A should have three employees who worked over 25 hours (4,3 and 1), company B should have two employees who worked over the threshold (3 and 1).

Running the threshold on the sets aggregated by employee number but ignoring legal employer, then counting the presence of the employee numbers exceeding the threshold, aggregated by legal employer, should do the job.

I could never figure out how to get the second nested aggregation working (the outer one, to aggregate the results of the table of employees with total hours>threshold, and map each yes/no to each legal employer the employee worked for). My second solution (in the comments) kinda does this, but I'm not sure it's robust to multiple selections on legal employer. I'll do more work on this Monday and let you know if you're interested?

This is my current "state of the art"

=Sum( aggr(  Count( DISTINCT

aggr( NODISTINCT IF( SUM( Total <[Legal Employer],[Employee Number]> Hours_hrs)>=vThreshold,[Employee Number],Null())

,[Employee Number])) ,[Legal Employer]))

But passing the Employee Number back and forth feels like a hack, and as mentioned it doesn't seem robust for more than one legal employer at a time (fine for the main table, but as described this needs the first formula for the aggregate).

More importantly, I still don't understand why my first formula doesn't return the intended result - anyone who can provide some insight on that front I'll be eternally grateful.

sunny_talwar

How about how this looks

=Count(DISTINCT Aggr(If(Sum(TOTAL <[Employee Number]> Hours_hrs) >=  25, [Employee Number]), [Employee Number], [Legal employer]))


Capture.PNG

swuehl
MVP
MVP

Another option could be

=Count({<[Employee Number] = {"=Sum({1} Hours_hrs)>=25"} >} DISTINCT [Employee Number])

(using {1} set identifier in field modifier expression seach to ignore e.g.selections on the employer).

edit: and to make your original expression work, you could try

=Count(DISTINCT

  aggr (NODISTINCT

IF ( SUM (  {$+<[Legal employer]=>}  Total <[Employee Number]>    Hours_hrs)  >=  25 , [Employee Number] )

,[Employee Number])

)

But I would rather use Sunny's expression using two aggr() dimensions to solve the dimensional grain mismatch, see also

Pitfalls of the Aggr function

Anonymous
Not applicable
Author

Thanks Stefan, and thanks Sunny!

I've marked Stefan correct for the edit, as it's the only version that returns the desired behaviour - for some reason (possibly the dimensions and selections outside of the scope of the toy example I provided) I can't get Sunny's version to work in the live app, whereas Stefan's suggestion of

=Count(DISTINCT  aggr (NODISTINCT  IF ( SUM (  {$+<[Legal employer]=>}  Total <[Employee Number]>    Hours_hrs)  >=  vThreshold, [Employee Number] )  ,[Employee Number]))

did the trick first time, with the correct expression total as desired.

Comparing that with my previous version,

=Sum( aggr(  Count( DISTINCT aggr(NODISTINCT IF( SUM( Total <[Legal Employer],[Employee Number]> Hours_hrs)>=vThreshold,[Employee Number],Null()),[Employee Number])),[Legal Employer]))

it seems like the additional aggregation by Legal Employer was what was ruining the expression total, and I added that one because I was getting the set analysis wrong (the {$+<... part for legal employer wasn't there). The grain mismatch I think is necessary in order to get the right behavior consistently when changing selections on other dimensions, otherwise you need to include every single dimension in a similar way.

Thanks again both of you!