Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

inconsistent counting results in script and export

Hello. I am observing inconsistent--relative to my expectations--results around counting of unique/distinct values in a field. I need to verify that a table contains no duplicates in a single field. This is in preparation to denormalize/flatten data into a single table. Unfortunately, it will require a great deal of effort to remove prioritary data in order to post a .qvw here. I'm hopeful that my description below is sufficent to explain and diagnose what's going on. I want to know whether I'm validating uniquenss incorrectly or if I've misunderstood QV's handling of counts and uniques.

My expectation is that there should be no duplicate values in a specific field; I have already done all necessary scrubbing. I use NoOfRows to find values in my scrubbed table:

LET vCountPrices = NOOFROWS('Pricing');

This tells me there are 743,846 rows (vCountPrices= 743,846). These should be already be unique on the field in question, but I want to validate. To validate that there are no uniques, I first tried to pull all distinct values into their own table and find out how many there are:

Count_ID_x_Price:

     NOCONCATENATE

     LOAD

          DISTINCT [ID_x_Price] AS [Unique]

     RESIDENT Pricing

;

LET vUniquePrices = NOOFROWS('Count_ID_x_Price');

This results in vUniquePrices = 738,862. I already have a problem!  738,862 !=  743,846, so the raw numbers are telling me that I have duplicate values in my original table! Interestingly, when I create a table box on Pricing and export to .csv, I get 738,862 lines. This tells me that NOOFROWS('Pricing') is giving inaccurate data... the QV formula seems to reflect more rows than are actually there! When I put a statistics box on this field, I get text and total count of 743,846; the null and missing counts are both 0. Can anyone shed light on what's going on here?

As a sanity check, I tried a different way of counting Pricing using fieldvaluecount. This returns 738,862 (v2ndCountFV = 738,862).

LET v2ndCountFV = FieldValueCount('ID_x_Price');

Based on all of this observed behavior, I can be fairly confident that there are 738,862 records--all unique--in my Pricing table. I think the 743,846 is a false value. I'm trying to understand what/why I get that bad value. Help!

1 Solution

Accepted Solutions
Gysbert_Wassenaar

A table box will not show duplicate records. So exporting what's shown in the table box won't give an accurate result if there are duplicate records in your table. The only reasonable explanation is that your table does in fact have duplicate records.


talk is cheap, supply exceeds demand

View solution in original post

3 Replies
Gysbert_Wassenaar

A table box will not show duplicate records. So exporting what's shown in the table box won't give an accurate result if there are duplicate records in your table. The only reasonable explanation is that your table does in fact have duplicate records.


talk is cheap, supply exceeds demand
swuehl
MVP
MVP

My expectation is that there should be no duplicate values in a specific field; I have already done all necessary scrubbing.

What have you done to ensure this?

If ID_x_Price is not a key field, you can create a list box of that field, check show frequency options in properties and sort by frequency desc. This may give you an hint, which values are duplicated (and then maybe also why).

Not applicable
Author

Gysbert,

Correct again! There were indeed lurking duplicates. You also clarified the count function in the table box.

Thanks!