Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi there,
I am trying to make a set analysis for a bit of SQL code which I have done which shows the correct number of distinct customers which have both products.
select count(distinct ID) as Number
from table1 x
where
X.Year= '2016/05' and
(X.Product_1 + X.Product_1 + X.Product_1 + X.Product_1 + X.Product_1 ) > 0 and
( X.Product_2+ X.Product_2+ X.Product_2) > 0 and
( X.Product_3+ X.Product_3+ X.Product_3) = 0 and
( X.Product_4+ X.Product_4) = 0
(If these two products added together are greater than 0 then count as distinct 1 on ID).
The set analysis I have been trying to work with is the following;
COUNT ({< ID=P({<Product_1={">0"}>}) + P({<Product_1={">0"}>}) + P({<Product_1={">0"}>}) + P({<Product_1={">0"}>}) + P({<Product_1={">0"}>})
+P ({<Product_2={">0"}>}) +P ({<Product_2={">0"}>}) +P ({<Product_2={">0"}>})
>} DISTINCT ID)
This gives an incorrect number, should I be excluding the other tables also to get the correct figure, just like my SQL?
Would you be able to expand on the usage of repeating the use of X.Product_1 on first row and repeating products on other rows?
select count(distinct ID) as Number
from table1 x
where
X.Year= '2016/05' and
(X.Product_1 + X.Product_1 + X.Product_1 + X.Product_1 + X.Product_1 ) > 0 and
( X.Product_2+ X.Product_2+ X.Product_2) > 0
( X.Product_3+ X.Product_3+ X.Product_3) = 0
( X.Product_4+ X.Product_4) = 0
Hi Sunny,
The products have their own individual name, product one was just an example (category type).
I am not sure what you are trying to do, would you be able to share few rows of sample (dummy) data with expected output based on the data?
Year | ID | Product 1 - Apple | Product 1 - Orange | Product 1 - Pear | Product 2 - Carrot | Product 2 - Potato | Product 2 -Onion | Product 2 - Celery |
2016/05 | 99999 | 0 | 1 | 0 | 0 | 1 | 1 | 0 |
2016/05 | 10000 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
Here is an example of dummy data.
Therefore every time product 1 & 2 has at least one true value (1) then it counts ID as distinct 1.
The original code excludes the other two columns on the table to give the correct output.
I have an idea, add flag columns in your load statment
such as if (X.Product_1 + X.Product_1 + X.Product_1 + X.Product_1 + X.Product_1 ) > 0, 1, 0) as Product1Flag,
Do the same to
( X.Product_2+ X.Product_2+ X.Product_2) > 0
( X.Product_3+ X.Product_3+ X.Product_3) = 0
( X.Product_4+ X.Product_4) = 0
then you only need to filter the flag columns to get what you need.