2 Replies Latest reply: May 20, 2015 4:22 AM by Susanna Neiglick RSS

    Distinct count over several fields

      Hi,

       

      I have a PSQL query which combines email addresses from three sources, counting distinct email addresses using two different timestamp fields and several attribute fields to restrict the result.

       

      I have been trying to get the same distinct count result in Qlikview, but I've been unsuccessful so far. Doing this in loader I think would not work as end users need to be able to browse records using the timestamp fields.

       

      I get correct results when counting over a single field. It seems combining the fields is a problem.

       

      Some logic:

       

      Senders may be users, recipients or not either.

      Recipient email always has a match in user email in full records, but not necessarily over the time interval under scrutiny.

      Users may be senders or recipients or not either.

       

      So doing a simple:

      Count distinct recp_email
      Count distinct send_email
      Count distinct usr_email

       

      all produce correct results within their respective fields.

       

      However, combining:

       

      Count(DISTINCT send_email & '|' & recp_email & '|' & usr_email)

      or

      Count(DISTINCT send_email & '-' & recp_email & '-' & usr_email)

      do not.

       

      I also tried nested ifs where the logic should be correct.

       

      // users who are not senders or recipients
      Count(DISTINCT if(not(match(send_email, usr_email))
          ,usr_email
          ,
          if(not(match(recp_email, usr_email))
              ,usr_email
              )))

          +
      //recipients who are not users or senders
      Count(DISTINCT if(not(match(recp_email, usr_email))
          ,recp_email
          ,

           if(not(match(recp_email, send_email))
              ,recp_email
              )))   

          +
      //senders who have not been recipients or users
      Count(DISTINCT if(not(match(send_email, recp_email))
          ,send_email
          ,
          if(not(match(send_email, usr_email))
              ,send_email
                      )))

       

      ATM I feel stuck with this. I also tried aggr and several other if-approaches, without much success, could be because I am relatively new to QV and couldn't get the syntax right. Any suggestions? I am already familiar with the thread counting distinct based on 2 fields? and several others on the topic.

        • Re: Distinct count over several fields
          Ramon Covarrubias

          to me it seems as you probably might have some null values that are missing up with you count,though I do feel this is one of those cases would we have to see the data in order to provide a solution

            • Re: Distinct count over several fields

              Hi Ramon,

               

              I'll try and refactor a sample. The real data-set is very large.

               

              I took a look at the data on a row level, creating a table that had usr_email as the main dimension and recp_email, send_email as plain expressions. Then I added these scripts as expressions and got a right-looking count on row-level.

               

              I also tried checking for count integrity using IsNull() and noted that NullCount did not seem to produce correct results. I have not named or otherwise changed null values in the data, they should be true nulls.

               

              I did note, however, that Qlikview marks values in tabular data as null and seemed to evaluate multiple matches of emails into nulls in tabular context. So, if there were several emails matching send_email (several senders to a single recipient), tabular view would show these as null instead of pointing to this "array". Distinct count of these was, however, returned correct on row level upon closer inspection.

               

              Mostly, I end up having too many distincts, not too few. But I'll try and refactor the loader to get a small obfuscated sample :-).