Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
hi,
i have a data input about products, each with a number of product versiones.
each version has a status (enabled, disabled, deleted) and a product that had a deleted status in one of its versions cannot move to enabled in a later version.
now, im trying to calculate the total number of distinct products that are enabled.
calculating count (distinct product id) and filtering on status will not be good since if a product is deleted at its latest version, it still had an earlier version enabled so the filter will keep the earlier versions and the count distinct will count the product though it shouldnt....
data looks somthing like this:
product id | product version id | status |
---|---|---|
123456 | 1 | enabled |
123456 | 2 | enabled |
123456 | 3 | enabled |
456789 | 1 | enabled |
456789 | 2 | deleted |
789000 | 1 | enabled |
789000 | 2 | enabled |
so, for the above sample my count should return '2' since second product was deleted on its second version but a regular coun distinct will return 3.
i was thinking about somthing like selecting the max version per product and for that filter out the deleted and only than count the products.
any ideas?
Hi,
An expression like the following will work:
=Count(Aggr(Only(If(Aggr(FirstSortedValue(status, -[product version id]), [product id]) <> 'deleted', [product id])), [product id]))
However, it's quite tricky and depending on the number of rows your application is handling, it may take a long time to calculate. So, you rather get the value in the script, using Peek() and Previous() functions:
Original:
LOAD [product id],
[product version id],
status
FROM
Distinct.xls
(biff, embedded labels, table is Sheet1$);
Final:
LOAD *,
If([product id] <> Previous([product id]), If(status = 'deleted', 0, 1)) AS ProductCheck
RESIDENT Original // do not use resident loads with large tables
ORDER BY [product id], [product version id] DESC;
DROP TABLE Original;
The expression now is as simple as
Sum(ProductCheck)
Hope that helps.
Miguel
Hi,
An expression like the following will work:
=Count(Aggr(Only(If(Aggr(FirstSortedValue(status, -[product version id]), [product id]) <> 'deleted', [product id])), [product id]))
However, it's quite tricky and depending on the number of rows your application is handling, it may take a long time to calculate. So, you rather get the value in the script, using Peek() and Previous() functions:
Original:
LOAD [product id],
[product version id],
status
FROM
Distinct.xls
(biff, embedded labels, table is Sheet1$);
Final:
LOAD *,
If([product id] <> Previous([product id]), If(status = 'deleted', 0, 1)) AS ProductCheck
RESIDENT Original // do not use resident loads with large tables
ORDER BY [product id], [product version id] DESC;
DROP TABLE Original;
The expression now is as simple as
Sum(ProductCheck)
Hope that helps.
Miguel
Hi,
I might have misunderstood your question but would something like this solve your problem?
Count({$<status={enabled},[product version id]={$(=Max([product version id]))>} DISTINCT [product id])
niromo,
i think your solution will not be good enough since it compares the max version id, not taking underconsideration that each product has a different max version id.
miguel,
thank you for your reply. i'll give it a try and let you know what i come up with...