5 Replies Latest reply: Jun 11, 2014 4:54 AM by Friedrich Hofmann

# How to count No.of distinct Id's?

Hi All,

This is my sample input data:

| Date | Id | Status |
| --- | --- | --- |
| 6/1/2014 | 101 | 1 |
| 6/1/2014 | 101 | 2 |
| 6/1/2014 | 101 | 44 |
| 6/1/2014 | 101 | 45 |
| 6/2/2014 | 102 | 2 |
| 6/2/2014 | 102 | 3 |
| 6/2/2014 | 102 | 44 |
| 6/2/2014 | 102 | 5 |
| 6/3/2014 | 103 | 1 |
| 6/3/2014 | 103 | 2 |
| 6/3/2014 | 104 | 3 |
| 6/4/2014 | 104 | 5 |
| 6/4/2014 | 104 | 45 |
| 6/4/2014 | 104 | 9 |
| 6/5/2014 | 105 | 1 |

Here I need to count the distinct IDs. The logic is:

1) For one day, if an ID contains status = 44, then the "Pick schedule" count is 1 and "Pick not schedule" is 0. If the same ID doesn't contain 44, then "Pick not schedule" is 1.

2) For one day, if an ID contains status = 45, then the "Deli schedule" count is 1 and "Deli not schedule" is 0. If the same ID doesn't contain 45, then "Deli not schedule" is 1.

3) For one day, if an ID contains neither status 44 nor 45, then "Pick not schedule" is 1 and "Pick schedule" is 0, and "Delivery schedule" is 0 and "Delivery not schedule" is 1.

I need the output below (the Id column is just for our understanding, not to show):

| Date | Status | Count | Id (for understanding only) |
| --- | --- | --- | --- |
| 6/1/2014 | Pick schedule | 1 | |
| | Pick not schedule | 0 | |
| | Delivery schedule | 1 | |
| | Delivery Not schedule | 0 | |
| 6/2/2014 | Pick schedule | 1 | |
| | Pick not schedule | 0 | |
| | Delivery schedule | 0 | |
| | Delivery Not schedule | 1 | |
| 6/3/2014 | Pick schedule | 0 | |
| | Pick not schedule | 2 | id=103,104 |
| | Delivery schedule | 0 | |
| | Delivery Not schedule | 2 | id=103,104 |
| 6/4/2014 | Pick schedule | 0 | |
| | Pick not schedule | 2 | id=104,105 |
| | Delivery schedule | 1 | id=104 |
| | Delivery Not schedule | 1 | id=105 |
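For readers who want to experiment with this, the sample data above can be loaded in the QlikView script as an inline table (a sketch; the table name `Orders` is my own choice):

```
// Sample input data from the question, loaded as an inline table
Orders:
LOAD * INLINE [
Date, Id, Status
6/1/2014, 101, 1
6/1/2014, 101, 2
6/1/2014, 101, 44
6/1/2014, 101, 45
6/2/2014, 102, 2
6/2/2014, 102, 3
6/2/2014, 102, 44
6/2/2014, 102, 5
6/3/2014, 103, 1
6/3/2014, 103, 2
6/3/2014, 104, 3
6/4/2014, 104, 5
6/4/2014, 104, 45
6/4/2014, 104, 9
6/5/2014, 105, 1
];
```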
• ###### Re: How to count No.of distinct Id's?

Hi Ashok,

I don't quite understand your problem.

- You have (in the example 6/1/2014) four lines with the same ID, but different statuses.

- Status 44 is present once, so "Pick"=1 and "Not pick"=0

- Status 45 is present once, so "Deli"=1 and "No Deli"=0

<=> However, there are two more lines where the status is neither of the two, so for those lines "Not pick" and "No
Deli" would be 1 and the other ones would be 0, so that in the end each would be 1, no?

Do you want those lines counted or not?

Best regards,

DataNibbler

• ###### Re: How to count No.of distinct Id's?

Here, if for one day (6/1/2014) an ID is marked with status 44 and 45, there is no need to check the other statuses for the same day; just take the counts of "Not pick" and "Not Deli" as zero.

• ###### Re: How to count No.of distinct Id's?

Hi Ashok,

well, in that case, to do it in one step, I would

- create two more fields, binary fields (1 or 0) - two of those, one for status 44 and one for status 45, right?

=> Aggregate the table with the expression >>> SUM([field_for_status_44]) <<< and the same for 45.

=> Mind, you need a GROUP BY clause in the same LOAD where you have these expressions, containing all the
fields that you do not aggregate (all but the two summed-up fields).

=> That will give you a table with either 1 or 0 (I assume it cannot be more than 1), which tells you whether status 44
or status 45 is present in the lines for that ID.
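The two steps described above could look like this in the script (a sketch only; the source table name `Orders` and the field names are assumptions):

```
// Step 1: binary helper fields - 1 if the line carries the status, else 0
Flags:
LOAD [date],
     [product_ID],
     IF([Status] = 44, 1, 0) AS [field_for_status_44],
     IF([Status] = 45, 1, 0) AS [field_for_status_45]
RESIDENT Orders;   // assuming the raw data was loaded as "Orders"

// Step 2: aggregate; the GROUP BY lists every field that is not summed up
Aggregated:
LOAD [date],
     [product_ID],
     SUM([field_for_status_44]) AS [44_is_present],
     SUM([field_for_status_45]) AS [45_is_present]
RESIDENT Flags
GROUP BY [date], [product_ID];

DROP TABLE Flags;  // the helper table is no longer needed
```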

HTH

Best regards,

DataNibbler

• ###### Re: How to count No.of distinct Id's?

Thanks.

So if an ID has neither of the two (44 and 45), we also need to take the counts "Not pick" = 1 and "Not deli" = 1; how do we take those counts if the binary field is marked 0?

And there is a chance that one ID gets the same status, i.e. either 44 or 45 may appear twice or thrice on the same day.

Please send me the code.

• ###### Re: How to count No.of distinct Id's?

Hi,

I'll begin from the end: There can be no confusion if you create two separate fields, one that returns 1 only if 44 was found, otherwise 0 - and one returning 1 if 45 was found, otherwise 0.

- >>> IF([Status] = 44, 1, 0) AS [44_is_present] <<<

- >>> IF([Status] = 45, 1, 0) AS [45_is_present] <<<

(The square brackets are needed because the field name starts with a digit.)

You can then sum those up, including the product_ID and the day in your GROUP BY.

=> That would be >>> GROUP BY [date], [product_ID]; <<< (as the last line of your LOAD in the script; that means
you get one line per day per product_ID).

=> Then, in each of the two fields, you will get a 1 (or possibly more) if the respective status is present for one specific
product_ID on one specific day; otherwise you'll get 0.

=> Then you can go on based on that.
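Putting those steps together, a minimal sketch of the whole script (the table names `Orders`, `Schedule` and `Counts` are assumptions; the outer `IF(SUM(...) > 0, 1, 0)` caps the flag at 1 even when a status appears twice or thrice on the same day, which addresses the concern above):

```
// One line per day per ID, with 0/1 flags for status 44 and 45
Schedule:
LOAD [date],
     [product_ID],
     IF(SUM(IF([Status] = 44, 1, 0)) > 0, 1, 0) AS [44_is_present],
     IF(SUM(IF([Status] = 45, 1, 0)) > 0, 1, 0) AS [45_is_present]
RESIDENT Orders            // assuming the raw data was loaded as "Orders"
GROUP BY [date], [product_ID];

// The four requested counts per day then follow from the flags:
// an ID with flag 0 counts as "not scheduled" (1 - flag)
Counts:
LOAD [date],
     SUM([44_is_present])     AS [Pick schedule],
     SUM(1 - [44_is_present]) AS [Pick not schedule],
     SUM([45_is_present])     AS [Deli schedule],
     SUM(1 - [45_is_present]) AS [Deli not schedule]
RESIDENT Schedule
GROUP BY [date];
```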

Best regards,

DataNibbler