3 Replies Latest reply: Jul 10, 2013 3:42 PM by Stephen Daspit RSS

    inconsistent counting results in script and export

      Hello. I am observing inconsistent--relative to my expectations--results around counting of unique/distinct values in a field. I need to verify that a table contains no duplicates in a single field. This is in preparation to denormalize/flatten data into a single table. Unfortunately, it will require a great deal of effort to remove prioritary data in order to post a .qvw here. I'm hopeful that my description below is sufficent to explain and diagnose what's going on. I want to know whether I'm validating uniquenss incorrectly or if I've misunderstood QV's handling of counts and uniques.

       

      My expectation is that there should be no duplicate values in a specific field; I have already done all necessary scrubbing. I use NoOfRows to find values in my scrubbed table:

       

      LET vCountPrices = NOOFROWS('Pricing');

       

      This tells me there are 743,846 rows (vCountPrices= 743,846). These should be already be unique on the field in question, but I want to validate. To validate that there are no uniques, I first tried to pull all distinct values into their own table and find out how many there are:

       

      Count_ID_x_Price:

           NOCONCATENATE

           LOAD

                DISTINCT [ID_x_Price] AS [Unique]

           RESIDENT Pricing

      ;

       

      LET vUniquePrices = NOOFROWS('Count_ID_x_Price');

       

      This results in vUniquePrices = 738,862. I already have a problem!  738,862 !=  743,846, so the raw numbers are telling me that I have duplicate values in my original table! Interestingly, when I create a table box on Pricing and export to .csv, I get 738,862 lines. This tells me that NOOFROWS('Pricing') is giving inaccurate data... the QV formula seems to reflect more rows than are actually there! When I put a statistics box on this field, I get text and total count of 743,846; the null and missing counts are both 0. Can anyone shed light on what's going on here?

       

      As a sanity check, I tried a different way of counting Pricing using fieldvaluecount. This returns 738,862 (v2ndCountFV = 738,862).

       

      LET v2ndCountFV = FieldValueCount('ID_x_Price');

       

      Based on all of this observed behavior, I can be fairly confident that there are 738,862 records--all unique--in my Pricing table. I think the 743,846 is a false value. I'm trying to understand what/why I get that bad value. Help!