Skip to main content
Announcements
Global Transformation Awards! Applications are now open. Submit Entry
cancel
Showing results for 
Search instead for 
Did you mean: 
vdcastro
Partner - Contributor III
Partner - Contributor III

set analysis with expressions

Hello everyone,

i am trying to solve a problem with no sucess.

i have this dataset

CodControlProblemDate
COD01AID110/10/2016
COD01AID210/10/2016
COD01BID114/10/2016
COD01B
ID211/10/2016

and i want to count the problems that are not closed after 3 days in point A.

Ex: i have 1 problem alive (14/10/2016 point B) 3 days after 10/10/2016

i want to create a table with this output:

CodProblems
COD0183950
COD0253882
COD0377741
COD0477354

but my current result is:

CodProblems
COD01450
COD02288
COD0319761
COD0477354

my expression is:

=count({< Control= {'B'}, Date={">=$(=(max( {$< Control={'A'}>}date)+3 ))"}>} Problem).

the problem i think i have is that my Date field is with the value of the date of my last row (COD04) because the last row is correct.

I think if i force the Cod on this expression, i obtain the correct result in the table.

Is this possible? something like

=count({< Control= {'B'}, Date={">=$(=(max( {$< COD_BY_DIMENSION, Control={'A'}>}date)+3 ))"}>} Problem).


Thanks by your help

1 Solution

Accepted Solutions
johnw
Champion III
Champion III

The quick answer is that set analysis expressions cannot be sensitive to dimensions so you can't do it the way you're trying to do it. There's technically a way around that, but it's hideously complex, and we don't need to go there for this problem.

We need to check the dates for each problem in each specific Cod. That's an aggr() by Cod and Problem.

aggr(...something...,Cod,Problem)

Now, what do we need for the "something"? It would need to return a count of 1 or a single problem ID if for that specific problem, the B point is > 2 days past the A point. So we want to say something like this:

if(Control B Date - Control A Date > 2, 1)

It would be good to use set analysis to identify the control B date and control A date of interest, in the name of performance. But didn't I just say we couldn't do that? Well, sort of, but not exactly. It's the set itself that is only evaluated once, but we can use a set expression where that's just fine. We need an aggregation function to use set analysis, so just in case there are multiple dates, I'll go with max() or min(). So I figure something like this:

if(max({<Control={'B'}>} Date) - min({<Control={'A'}>} Date) > 2, 1)

And I tend to "cheat" with my Boolean algebra in a case like this. I know that an expression that is true evaluates to -1, false to 0. So I can rewrite the above as:

-(max({<Control={'B'}>} Date) - min({<Control={'A'}>} Date) > 2)

I know, I know, it doesn't save more than a couple characters, and is harder to read, but I'm just so used to seeing them that way. Do whichever makes you more comfortable. Anyway, we plug that into our aggr:

aggr(-(max({<Control={'B'}>} Date) - min({<Control={'A'}>} Date) > 2),Cod,Problem)

But we aren't done, because the aggr() will return multiple rows for each Cod. We want the count of those rows with a 1 on them. Easy enough, just sum, and that will give us that count.

sum(aggr(-(max({<Control={'B'}>} Date) - min({<Control={'A'}>} Date) > 2),Cod,Problem))

And I typically put the - on the outside in a case like this, and then I doubt we'll need the inner parentheses around our Boolean expression:

-sum(aggr(max({<Control={'B'}>} Date) - min({<Control={'A'}>} Date) > 2,Cod,Problem))

I might have it wrong, but if I've understood your question and not made a mistake, that should work. That said, a more understandable version might read like this:

count(aggr(if(max({<Control={'B'}>} Date) - min({<Control={'A'}>} Date) > 2,Problem),Cod,Problem))

Same basic idea, but now we're counting problems instead of negative summing a boolean expression, which is probably more clear. Clarity is good, and I have an unfortunate tendency to sacrifice it on the altar of brevity. Mea Culpa.

View solution in original post

3 Replies
sunny_talwar

From what you have mentioned above, it seems that set analysis might not work for you. May be you need to use Aggr() function. But I am not 100% sure. Would you be able to provide a sample document with the expected output based on the sample provided?

johnw
Champion III
Champion III

The quick answer is that set analysis expressions cannot be sensitive to dimensions so you can't do it the way you're trying to do it. There's technically a way around that, but it's hideously complex, and we don't need to go there for this problem.

We need to check the dates for each problem in each specific Cod. That's an aggr() by Cod and Problem.

aggr(...something...,Cod,Problem)

Now, what do we need for the "something"? It would need to return a count of 1 or a single problem ID if for that specific problem, the B point is > 2 days past the A point. So we want to say something like this:

if(Control B Date - Control A Date > 2, 1)

It would be good to use set analysis to identify the control B date and control A date of interest, in the name of performance. But didn't I just say we couldn't do that? Well, sort of, but not exactly. It's the set itself that is only evaluated once, but we can use a set expression where that's just fine. We need an aggregation function to use set analysis, so just in case there are multiple dates, I'll go with max() or min(). So I figure something like this:

if(max({<Control={'B'}>} Date) - min({<Control={'A'}>} Date) > 2, 1)

And I tend to "cheat" with my Boolean algebra in a case like this. I know that an expression that is true evaluates to -1, false to 0. So I can rewrite the above as:

-(max({<Control={'B'}>} Date) - min({<Control={'A'}>} Date) > 2)

I know, I know, it doesn't save more than a couple characters, and is harder to read, but I'm just so used to seeing them that way. Do whichever makes you more comfortable. Anyway, we plug that into our aggr:

aggr(-(max({<Control={'B'}>} Date) - min({<Control={'A'}>} Date) > 2),Cod,Problem)

But we aren't done, because the aggr() will return multiple rows for each Cod. We want the count of those rows with a 1 on them. Easy enough, just sum, and that will give us that count.

sum(aggr(-(max({<Control={'B'}>} Date) - min({<Control={'A'}>} Date) > 2),Cod,Problem))

And I typically put the - on the outside in a case like this, and then I doubt we'll need the inner parentheses around our Boolean expression:

-sum(aggr(max({<Control={'B'}>} Date) - min({<Control={'A'}>} Date) > 2,Cod,Problem))

I might have it wrong, but if I've understood your question and not made a mistake, that should work. That said, a more understandable version might read like this:

count(aggr(if(max({<Control={'B'}>} Date) - min({<Control={'A'}>} Date) > 2,Problem),Cod,Problem))

Same basic idea, but now we're counting problems instead of negative summing a boolean expression, which is probably more clear. Clarity is good, and I have an unfortunate tendency to sacrifice it on the altar of brevity. Mea Culpa.

vdcastro
Partner - Contributor III
Partner - Contributor III
Author

Thank you by your answer.

I understood your logic and your explanation. Amazing

Thank you very much