Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Distinct count over several fields

Hi,

I have a PSQL query which combines email addresses from three sources, counting distinct email addresses using two different timestamp fields and several attribute fields to restrict the result.

I have been trying to get the same distinct count result in Qlikview, but I've been unsuccessful so far. Doing this in loader I think would not work as end users need to be able to browse records using the timestamp fields.

I get correct results when counting over a single field. It seems combining the fields is a problem.

Some logic:

Senders may be users, recipients or not either.

Recipient email always has a match in user email in full records, but not necessarily over the time interval under scrutiny.

Users may be senders or recipients or not either.

So doing a simple:

Count distinct recp_email
Count distinct send_email
Count distinct usr_email

all produce correct results within their respective fields.

However, combining:

Count(DISTINCT send_email & '|' & recp_email & '|' & usr_email)

or

Count(DISTINCT send_email & '-' & recp_email & '-' & usr_email)

do not.

I also tried nested ifs where the logic should be correct.

// users who are not senders or recipients
Count(DISTINCT if(not(match(send_email, usr_email))
    ,usr_email
    ,
    if(not(match(recp_email, usr_email))
        ,usr_email
        )))

    +
//recipients who are not users or senders
Count(DISTINCT if(not(match(recp_email, usr_email))
    ,recp_email
    ,

     if(not(match(recp_email, send_email))
        ,recp_email
        )))   

    +
//senders who have not been recipients or users
Count(DISTINCT if(not(match(send_email, recp_email))
    ,send_email
    ,
    if(not(match(send_email, usr_email))
        ,send_email
                )))

ATM I feel stuck with this. I also tried aggr and several other if-approaches, without much success, could be because I am relatively new to QV and couldn't get the syntax right. Any suggestions? I am already familiar with the thread counting distinct based on 2 fields? and several others on the topic.

2 Replies
ramoncova06
Specialist III
Specialist III

to me it seems as you probably might have some null values that are missing up with you count,though I do feel this is one of those cases would we have to see the data in order to provide a solution

Not applicable
Author

Hi Ramon,

I'll try and refactor a sample. The real data-set is very large.

I took a look at the data on a row level, creating a table that had usr_email as the main dimension and recp_email, send_email as plain expressions. Then I added these scripts as expressions and got a right-looking count on row-level.

I also tried checking for count integrity using IsNull() and noted that NullCount did not seem to produce correct results. I have not named or otherwise changed null values in the data, they should be true nulls.

I did note, however, that Qlikview marks values in tabular data as null and seemed to evaluate multiple matches of emails into nulls in tabular context. So, if there were several emails matching send_email (several senders to a single recipient), tabular view would show these as null instead of pointing to this "array". Distinct count of these was, however, returned correct on row level upon closer inspection.

Mostly, I end up having too many distincts, not too few. But I'll try and refactor the loader to get a small obfuscated sample :-).