We have a database with a lot of duplicate records. I am running a report that identifies the first create date of the record and counts how many of these records were real new records created recently. I'm using the formula below but I'm not getting result instead getting error in expression.
Both sum() and min() are aggregation functions. You cannot nest those without an intermediate call to aggr() that sort of unaggregates your data in a virtual table.
How is the uniqueness of your rows defined? Using "primary keys"? Using a series of dimension fields?
Then the easiest way to find the number of superfluous records per email address is like this:
All Email addresses that do not have duplicates will disappear from your table because of having a zero expression result. Note that you don't need aggr() for this to work.
A great text on aggr(): QlikView Technical Brief - AGGR
Thanks Peter. IT looks like I overcomplicated the approach. Getting a count of email will give me those that are brand new.
The only thing missing here will be identifying those that were recently created. As the count(email) will not consider when the record was created -- it could be years ago or it could be last week.
I thought of this... you think it's right? Looks like it worked.
I'm not sure I understand where you are heading.
The expression you posted counts all occurences (even those without duplicates) as long as the first one was created after 5/1/2017. Do you think that's right?
That is correct.
The first solution you provided removes all unique emails and identifies all the ones with duplicates -- which answers another use case for me. 😃