Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
bdiamante
Contributor III
Contributor III

Strange aggregation results...

I have a fact table in my database that captures school enrollment data. The table contains the following columns:

id, school_grade_id, gender_id, ethnicity_id, number_enrolled, and enrollment_year

When I sum the number enrolled for a particular school and year, I am getting some inconsistent results. 

When I do a direct SQL query of my database, I get a value of 446 students for this particular school in this particular year, however when I make the same query in Qlikview using the expression:

sum({Sample1<enrollment_year={2013}>}number_enrolled)

and I ensure that I have the correct school selected in the Sample1 state, I get 457.  Investigation has narrowed it down to three particular records that show up in Qlikview as being associations, that are not present in my database. 

After checking problem ID's to my database, those records that are causing the problem are in my database with a completely separate school_grade_id's, and should not be getting associated with the school I currently have selected. I cannot come up with a reason for why this would be happening.

Some more background: I have quite a large data model for this dashboard, so in an effort to resolve this on my own, I started a new document and loaded in only the essential tables to run my aggregation.  In the new dashboard, Qlikview returns the correct result. This leads me to believe there is an issue with my data model. However, the results are correct for all of the other schools I have spot checked thus far, and I cannot isolate any unique attribute of this problem school that would cause this to happen.

See the screen shots below. Any insight would be appreciated.

Capture.JPG.jpg

Capture.JPG.jpg

1 Solution

Accepted Solutions
bdiamante
Contributor III
Contributor III
Author

So the issue here was both simple and dangerous.  I had used several concatenated keys while creating my link table. When creating these keys, I used this method:

key = school_id & grade_id & subject_id & ...

Let's say the values in those various fields I was concatenating were as follows: RECORD1: school_id=1101, grade_id=5, subject_id=2, RECORD2: school_id=110, grade_id=15, subject_id=2.

When concatenated, these two records which shouldn't associate with each other, do associate because their concatenated keys become (mistakenly) the same.  The solution was simply to add a delimiter between each field I concatenated.  This is easy to overlook when creating your keys but can have huge impacts on your results!

View solution in original post

8 Replies
Gysbert_Wassenaar

It's not necessarily the school that's the problem. It could be a table linked to one of the other id fields. If you have a school with incorrect totals you could select that school and then reduce the data: File -> Reduce Data -> Keep Possible Values. That may make it easier to spot the problem table(s).


talk is cheap, supply exceeds demand
Not applicable

Do you know why school grade ID is different in your model compared to database?

bdiamante
Contributor III
Contributor III
Author

This is essentially the problem I'm trying to describe.  That shouldn't be the case, ever.  The same enrollment_fact_id shows two different school_grade_ids in Qlikview and my database... how can this happen?

Not applicable

Do you have any joins in your load?

bdiamante
Contributor III
Contributor III
Author

I create a Link Table to keep the three fact tables I have from associating common dimensions with each other, so I do have some concatenated tables. My script doesn't have any explicit joins though.

Not applicable

As you are saying you have concatenate loads, Can you add a field which carries name of the table, So that you can see from where these records are coming.

bdiamante
Contributor III
Contributor III
Author

So the issue here was both simple and dangerous.  I had used several concatenated keys while creating my link table. When creating these keys, I used this method:

key = school_id & grade_id & subject_id & ...

Let's say the values in those various fields I was concatenating were as follows: RECORD1: school_id=1101, grade_id=5, subject_id=2, RECORD2: school_id=110, grade_id=15, subject_id=2.

When concatenated, these two records which shouldn't associate with each other, do associate because their concatenated keys become (mistakenly) the same.  The solution was simply to add a delimiter between each field I concatenated.  This is easy to overlook when creating your keys but can have huge impacts on your results!

Gysbert_Wassenaar

Nice eh? That's one reason why using the autonumber is a good idea. The other is that fields with autonumber values take up less memory. From Symbol Tables and Bit-Stuffed Pointers:

When creating QlikView scripts, always ask yourself if there is any way to reduce these numbers, to minimize the memory usage. Here are a couple of common cases:

  • You have a long, concatenated, composite key that you don’t need to display. Use Autonumber() and the symbols will take no space in the symbol table. The integer values will instead be calculated implicitly.

talk is cheap, supply exceeds demand