rlp
Creator

How to explain this strange behavior?

For data quality reasons, I make extensive use of expressions such as aggr( count( DISTINCT <value> ), <dimension> ).
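
To make the intent concrete, here is a rough Python sketch of what I expect this kind of expression to compute: the number of distinct values seen for each dimension value, where any count above 1 flags a problem. It only illustrates the expected result and is not how QV evaluates the expression; the function names and sample rows are made up for the illustration.

```python
from collections import defaultdict


def distinct_count_per_dimension(rows):
    """Count distinct values per dimension value.

    rows is an iterable of (dimension, value) pairs (hypothetical input shape).
    """
    seen = defaultdict(set)
    for dimension, value in rows:
        seen[dimension].add(value)
    return {dimension: len(values) for dimension, values in seen.items()}


def dimensions_with_conflicts(rows):
    """Return the dimension values that map to more than one distinct value."""
    counts = distinct_count_per_dimension(rows)
    return sorted(d for d, n in counts.items() if n > 1)


# Made-up sample data: "B" carries two different values.
rows = [("A", 1), ("B", 2), ("B", 3), ("C", 4), ("C", 4)]
print(distinct_count_per_dimension(rows))  # {'A': 1, 'B': 2, 'C': 1}
print(dimensions_with_conflicts(rows))     # ['B']
```

Note that a row repeated with exactly the same value (like "C" above) still counts as 1, because the count is DISTINCT.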

Sadly, I found that the behavior of QV differs depending on the number of tables.

In the following qvw, you have two cases:

- First, when you select 2 for nb_of_tables, you have a first table without duplicates and a second one WITH duplicates. In this case QV builds the Cartesian product of the two tables, and my expression detects the duplicates.

- Second, when you select 3 for nb_of_tables, you still have the two previous tables plus a third one WITH duplicates. This time the Cartesian product is not built, and my expression does not detect the duplicates.

This raises two questions:

- How can I ensure data quality by detecting duplicates?

- I already ran into this strange behavior while following the "Advanced Topics" training, and the instructor told me that QV works with keys, not tables. I find that a bit confusing. Could someone clarify the internals of QV (insofar as possible)?

Please excuse my hesitant English.

Thanks for clarifying.
