Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
a quick one, where I´m uncertain:
I simply want to sum up two fields inculding (same) set analysis. Normally I do it in this way:
sum( {<[Date.Cal.Year]={2017}>} [FieldA]) + sum( {<[Date.Cal.Year]={2017}>} [FieldB])
Now I saw that this also works:
sum( {<[Date.Cal.Year]={2017}>} ([FieldA] + [FieldB]))
FieldA and FieldB are both from the same (facts) table.
Due to I have a very large data set and complex nested formulas I´d prefer the second version, but I´m uncertain if there could be any differences in the result.
Can anybody confirm whether the two versions are identical or otherwise outline the difference?
Thanks, Thomas
Your point b) assumption is correct I believe. Hence, you would get the set analysis data filter benefit even with the rangesum().
Hi,
sum( {<[Date.Cal.Year]={2017}>} RangeSum([FieldA] , [FieldB]))
This will work if Field A and Field B comes from different table ?
If the association is fine, it should work.
The tables are less important in the front end providing they are suitably associated with other.
Sum(RangeSum([FieldA] , [FieldB]) may give odd results if there is not a one to one relationship between A and B within the context of any chart/table dimensions. If for any value of FieldA, there is more than one possible value of FieldB, then the RangeSum() may return only the value of A. In this case, you might be better off with the slightly less efficient Sum(A) + Sum(B) approach.
>>if there is not a one to one relationship between A and B within the context of any chart/table dimensions...
Or it may double up the values of A....
For example with these two tables
ID | A | B |
---|---|---|
1 | 1 | 100 |
1 | <null> | 100 |
2 | 1 | 50 |
2 | 1 | 200 |
2 | 10 | <null> |
3 | 2 | 100 |
3 | 3 | 100 |
3 | 4 | 100 |
3 | 4 | 101 |
3 | 4 | <null> |
3 | <null> | <null> |
4 | 1 | 300 |
4 | 4 | 400 |
ID | C |
---|---|
1 | 100 |
1 | 200 |
2 | <null> |
3 | 200 |
4 | 300 |
4 | 400 |
Give these results:
Table1:
LOAD ID,
If(A = '*', Null(), A) as A,
If(B = '*', Null(), B) as B
Inline
[
ID, A, B
1, 1, 100
1, *, 100
2, 1, 200
2, 1, 50
2, 10, *
3, 2, 100
3, 3, 100
3, 4, 100
3, 4, *
3, 4, 101
3, *, *
4, 1, 300
4, 4, 400
];
Table2:
LOAD ID,
If(C = '*', Null(), C) as C
Inline
[
ID, C
1, 100
1, 200
2, *
3, 200
4, 300
4, 400
];
Hi,
just to add to this thread, as a general rule (I'm not sure if it's stated somewhere in help/documentation, etc) you should not use sum(A*B) or sum(A+B) if the fields belong to different tables. I haven't tested it, but in most cases you get ZERO if you try to do it.
But even if you get a result it won't be necessarily right. In other words it would be like an "unsafe" use of the sum in the form sum(A+B)/sum(A*B).
Like it's been said, you should take care of null values because you'll have different results.
Also sum(A) + sum(B) does the sum taking as reference totals, while sum(A+B) will do it record per record (similar to what sumproduct does in Excel). This is the explanation of why it won't work correctly if you try sum(A+B) when fields belong to different tables (also remember, even if you concatenate 2 different tables into one, they are still 2 tables)
so in conclusion you should never take sum(A) + Sum(B) and Sum(A+B) as equivalents,
hope this helps,
regards
Hi,
thanks for your post. You wrote:
also remember, even if you concatenate 2 different tables into one, they are still 2 tables
I can´t really imagine this. Can you share the explanation why this should happen?
>>even if you concatenate 2 different tables into one, they are still 2 tables
I don't think that is true. Perhaps you could offer some evidence as to why you believe this.
I gues what Jaime was referring to is a situation when you concatenate two tables into one and different fact fields come from different tables.
Table:
LOAD
ID,
A
FROM Source_1;
Concatenate(Table)
LOAD
ID,
B
FROM Source_2;
Effectively the result is one table, but with NULLs on each row:
ID | A | B |
---|---|---|
1 | 100 | NULL |
2 | 200 | NULL |
3 | 300 | NULL |
1 | NULL | 100 |
2 | NULL | 200 |
3 | NULL | 300 |
This really should be treated as if you were working with two separate tables (from aggregation perspective), so use RangeSum() or other approach to account for null values.