Hi Everyone,
I am using Talend Data Quality for data profiling. After running the Simple Statistics indicators on a column, the graph shown in the analysis result does not seem to display the correct values.
Please find the attached image.
In the image, the row count is 6896, but the other values (distinct count, null count, duplicate count, blank count) do not add up to this value.
Is anyone aware of this issue and how to solve it?
Thanks and Regards,
Ashish Chouhan
yzhao
Hi,
Let me use some sample data to make this clearer. Suppose the input data is as follows (the first line holds the column names, and we analyze "string_tmp"):
id;string_tmp
1;AA
2;AA
3;BB
4;CC
5;
6;DD
7;
8;DD
The analysis result is:
Row Count is 8.
Distinct Count is 5. The distinct values are AA, BB, CC, DD, and blank. This indicator counts the number of distinct values (like a "SELECT DISTINCT" SQL statement).
Unique Count is 2. The unique values are BB and CC. This indicator counts the values that occur exactly once.
Duplicate Count is 3. The duplicate values are AA, DD, and blank. This indicator counts the values that occur more than once.
So the Distinct Count is the sum of the Unique Count and the Duplicate Count (5 = 2 + 3).
These indicators analyze the values from different perspectives, so it is correct behavior that Row Count does not equal the sum of the other indicators.
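If it helps to check the numbers outside the Studio, here is a minimal Python sketch that recomputes the indicators on the sample data above. It is not Talend's implementation, only an illustration of the definitions. It assumes the empty values in rows 5 and 7 are read as empty strings, and the variable names are made up for this example.

```python
from collections import Counter

# Values of the "string_tmp" column from the sample data above.
# Assumption: the blank entries (ids 5 and 7) are read as empty strings.
values = ["AA", "AA", "BB", "CC", "", "DD", "", "DD"]

# Number of occurrences of each distinct value.
occurrences = Counter(values)

row_count = len(values)                                          # every row
distinct_count = len(occurrences)                                # different values
unique_count = sum(1 for n in occurrences.values() if n == 1)    # values seen exactly once
duplicate_count = sum(1 for n in occurrences.values() if n > 1)  # values seen more than once

print("Row Count:", row_count)
print("Distinct Count:", distinct_count)
print("Unique Count:", unique_count)
print("Duplicate Count:", duplicate_count)
```

Under these assumptions the sketch reports 8, 5, 2 and 3, the same figures as above. Row Count counts rows, while the other three count values, which is why only Distinct Count = Unique Count + Duplicate Count is expected to hold.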
