Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
We need to use the logical OR function in the load script: if any condition is TRUE, then Any_Condition is TRUE as well:
INPUT
Time | Condition |
---|---|
1 | 0 |
2 | 1 |
3 | 0 |
3 | 0 |
4 | 0 |
4 | 1 |
5 | 1 |
5 | 1 |
OUTPUT
Time | Any_Condition |
---|---|
1 | 0 |
2 | 1 |
3 | 0 |
4 | 1 |
5 | 1 |
So basically we could write like: if(sum(Condition)>0,1) group by Time
But for reasons of performance, we cannot use if statements. and this is why we should use an OR:
1-Product(1-Condition) group by Time
But there is no Prooduct function. How can we get what we want?
Cheers, Jonas
Are you sure this would perform better that the sum with an if() ?
I do not answer your orignal request, but maybe this would work and perform well enough:
Maybe you could try:
INPUT:
LOAD Time as InputTime,
Condition
FROM
[http://community.qlik.com/thread/47967?tstart=0]
(html, codepage is 1252, embedded labels, table is @1);
OUTPUT:
LOAD distinct InputTime as Time, 1 as Any_Condition
Resident INPUT where Condition = 1;
LOAD InputTime as Time, 0 as Any_Condition
Resident INPUT where Condition = 0 and not Exists(Time, InputTime);
edit:
removed typo
Ah, maybe also possible:
INPUT:
LOAD Time,
Condition
FROM
[http://community.qlik.com/thread/47967?tstart=0]
(html, codepage is 1252, embedded labels, table is @1);
OUTPUT:
LOAD Time, max(Condition) as Any_Condition
Resident INPUT group by Time;
Thanks Stefan!
Okay, but there may be an infinite number of concurrent conditions.
Then your way does no longer work, does it?
Cheers,
Jonas
I am not sure what your mean with 'infinite number of concurrent conditions' in terms of 'infinite' and 'concurrent'.
Do you mean yo have several Condition fields you need to take care of? At least the second example should be extendable quite easily.
Just for interest: Is your first approach using the if(sum()) really performing so bad?
Thanks, for pointing this out, I missed your second reply.
MAX is indeed a possibility, right.
But I worry about the performance: is there no other way to make the product of Fields, or a way to perform boolean operations over several fields (AND is easy: make product of FIELD, NOTis also easy: 1-FIELD, but how to make OR)?
We could also use a SUM and if SUM>0, but this would need an IF ...
Hm, what is so bad using a single if per group by value? Using sum should be pretty efficient also.
Haven't done any testings on this, but I can't really believe that this should perform so much worse than using (if that would exist) a Product function on a result got from a subtraction (1-Condition).
Maybe I am wrong here, would need to test.
edit:
Just tested with 10 Million Input rows:
Using two loads / where clause: 59 s
using Max function and group by 72 s
using if(sum(Condition),1,0) / group by 70 s