Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
I run into one question about how to sum amount.
I have used outer join connect 2 two tables into one.
OrderId Amount TypeID
1 $2 A
1 $2 A
1 $3 B
If I use sum(Amount) expression, then the results will be $7, but you can see row 2 should be removed, I just want to get $2+$3.
I would like some one to tell me I should use which expression.
Thanks.
Joining tables is always tricking especially when both the tables contain partial "Facts"!
I would interpret the case presented as under:
Fact1: Amount field belongs to the 'order detail' table and hence orderid may repeat for every line of items sold under one order.
Fact2: TypeID field belongs to the 'order header' table and "may" represent current status of the order given the business requirement. It is reasonable to assume that there could be a 'current status flag' to identify the orderid and the TypeID. This becomes a useful metric to analyse if the users wants to know how many times the "TypeID" changed for the order!
Recommendation:
1. I agree with Deepak. Do not join the table if the above 2 scenarios is true but only keep 'orderid' as a link field between the header and line table
2. If only the latest 'TypeID' is relevant, use the current status flag to reduce the header table to one unique row each orderid before jonining with the order details table
3 Exercise caution in using 'Distinct' - orderid can have two rows with 'same' amount but belonging to different items!
Hope this helps.
Many times, there is nothing like a 'right solution', it all depends on how the users interprets their data - as facts and dimensions!!
Hi Isaac,
Try this :
LOAD OrderID, Amount, TypeID
RESIDENT ...... WHERE NOT EXISTS (OrderID,Amount,TypeID);
regards, tresesco
HI,
I think you should not join both the tables.
you can link the two tables based on the key field and expression will work fine.
Deepak is right - if you join a details table to a header table, which it looks like you've done, you will get duplication of the totals.
Use keep rather than join, and you should get a better header - details data structure, and the results you're looking for.
Hi,
If all the field of rows same with the existing one then you just Load the table with Distinct keyword, then it will take one distinct row. Means once the outer join completed then make a resident load of final table with ditinct load.
Joining tables is always tricking especially when both the tables contain partial "Facts"!
I would interpret the case presented as under:
Fact1: Amount field belongs to the 'order detail' table and hence orderid may repeat for every line of items sold under one order.
Fact2: TypeID field belongs to the 'order header' table and "may" represent current status of the order given the business requirement. It is reasonable to assume that there could be a 'current status flag' to identify the orderid and the TypeID. This becomes a useful metric to analyse if the users wants to know how many times the "TypeID" changed for the order!
Recommendation:
1. I agree with Deepak. Do not join the table if the above 2 scenarios is true but only keep 'orderid' as a link field between the header and line table
2. If only the latest 'TypeID' is relevant, use the current status flag to reduce the header table to one unique row each orderid before jonining with the order details table
3 Exercise caution in using 'Distinct' - orderid can have two rows with 'same' amount but belonging to different items!
Hope this helps.
Many times, there is nothing like a 'right solution', it all depends on how the users interprets their data - as facts and dimensions!!
Many thanks for your help!