datanibbler
Champion

Trouble synchronizing the figures displayed in different objects ...

Hi,

I have, in an app I am currently building, two objects:

- A straight_table chart and

- a gauge_chart

and I am trying to synchronize them so that they display, as exactly as possible, the same figures.

Logical, isn't it? Sounds easy, too - well, so one should think ...

It is not. At first, the figures were quite far apart. I have by now removed one calculated dimension from the chart that the gauge did not have, and unticked "suppress NULL values" for another dimension, and the figures came closer.

To do that, I calculated a flag_field in one of my tables to tell whether or not a specific record is linked to any record in the other table. I used the Exists() function to check whether the keyfield linking the two tables exists in a copy of that keyfield in the other table. That seems to work quite well.
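Roughly like this - a minimal sketch with made-up table and field names, just to show the pattern (the other table has to be loaded first, so its key copy is already there when Exists() runs):

// Table_B holds a copy of the key that Exists() can look up later
Table_B:
LOAD LinkKey,
     LinkKey AS LinkKey_Copy    // copy used only for the Exists() lookup
INLINE [
LinkKey
A-01
A-02
];

// Table_A flags each record depending on whether its key occurs in Table_B
Table_A:
LOAD LinkKey,
     If(Exists(LinkKey_Copy, LinkKey), 1, 0) AS IsLinked_Flag
INLINE [
LinkKey
A-01
B-07
];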

Now, the figures are quite close - the chart shows a final ratio of 71.33 % and the gauge shows 70.2 %, which isn't bad - but still not ideal if there is any way to fix it. That can't be a single rounding error - though several rounding errors could stack up to that difference ...

Can you tell me what that could be due to?

Thanks a lot!

Best regards,

DataNibbler

41 Replies
datanibbler
Champion
Author

Hi Marcus,

I think this is a mathematical thing - I'll ask my uncle, he worked as a physicist and he knows it all 😉

The thing is, if I select just three shelf-rows, I get two different results depending on

- whether I take the entire count of tracks and divide it by the total sum of spaces I have,

- or whether I "manually" calculate the average of the percentage_values by adding up all the percentages and dividing them by their number - the way you usually go about calculating an average.

Sometimes the one is higher, sometimes the other.

Every item by itself is correct, so there must be something about calculating the average of percentages that just doesn't work.
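A made-up example shows how far apart the two can be: say shelf 1 holds 1 track in 10 spaces (10 %) and shelf 2 holds 90 tracks in 100 spaces (90 %). The ratio of the totals is (1 + 90) / (10 + 100) = 91 / 110 ≈ 82.7 %, while the plain average of the two percentages is (10 % + 90 %) / 2 = 50 %. The plain average gives the small shelf the same weight as the big one; the ratio of totals weights every shelf by its number of spaces.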

=> Might there be any way I can display something else in that column - not calculated per row, but based on the totals of the other columns in the table_chart - so I can have the correct number there?
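If the underlying expressions are something like Count(Track) and Sum(Spaces) (made-up field names), the TOTAL qualifier might be what I need - it ignores the chart dimensions and works on the totals:

Count(TOTAL Track) / Sum(TOTAL Spaces)   // repeats the overall ratio in every row

That would show the correct overall ratio in every row instead of a per-row percentage.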

P.S.: What does the formula_option "formula on all values" mean (the radio-box where I can also select the average of the y-values)? Which formula is applied when I tick that?

marcus_sommer

Hi DataNibbler,

maybe this is helpful: Average – Which average?

- Marcus

datanibbler
Champion
Author

Hi Marcus,

this was in the back of my head all the time. No, I think not - if I understand Henric's explanation correctly, the thing is actually quite logical: you have to take care of which granularity your data has and make that match the average you want. But that only applies when you calculate the average yourself, so I think it has nothing to do with my mathematical problem.

The difference is merely that in the company_database the tracks are not booked on shelf_lots - those don't exist in the database. Instead they are booked on "BINs", and one BIN does not always correspond to one storage_lot - but I have already taken care of that through the choice of the key linking the two tables.

However, I have put the question to him and I hope he can explain the underlying principle to me. In the meantime, I will rebuild my app, which isn't bad, because I can probably do one or two things better than I did the first time.

datanibbler
Champion
Author

Yippee! Well, I still don't exactly understand why - I never liked mathematics and statistics - but at least I now have it from an expert that the second method of calculation is plain wrong, so I can now, with a clear conscience, take that display out of the table_chart and just leave my overall usage_gauge.

Best regards,

DataNibbler

datanibbler
Champion
Author

Hi Marcus,

I think I'm going crazy over this stuff.

I have built it all from scratch now and I think it's better than before. Still, I have two very strange phenomena:

- I have tables A (our new shelf_database in Excel) and B (the one from our real database).

When I use a dimension_field from table_A and display a figure from table_A with a sum, the sum is right.

<=> When I also use a dimension_field from table_B, the sum of the figures from table_A is suddenly different.

=> I removed that dimension_field. Never mind, it was not critical; now that seems OK. It must be because the key is quite complicated and, to keep things legible, it is built step by step over about 7 PRECEDING LOADs. (Do you think that is a problem? It should not be ... at least, I have not yet found any items that are not properly linked.)
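For reference, the preceding-LOAD pattern looks roughly like this - a sketch with made-up field names; each step only transforms the output of the LOAD below it:

// Step 3: combine the parts into the final link key
Table_A:
LOAD *,
     ShelfPart & '|' & AreaPart AS LinkKey;
// Step 2: derive the parts from the raw Bin_name
LOAD *,
     SubField(Bin_name, '-', 1) AS ShelfPart,
     Left(Bin_name, 2) AS AreaPart;
// Step 1: the actual source load
LOAD Bin_name
INLINE [
Bin_name
123-04-07
];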

Another thing: I only have dimensions from table_A now. One of the expressions displays a figure from table_B - the count of boxes on all the shelves and storage_areas. I have a sum on top of that column.

Now, for some reason I cannot figure out, a textbox with the same calculation showed me 6 boxes less than my table_chart. I tried rebuilding the table_chart and got the same figure as in the textbox (those 6 less). Then I compared the formulas used in the two table_charts - they were the same. I copied the one from my first table_chart, and the figure jumped up by 6 boxes ... I rebuilt the chart another time and it was 6 less again ...

I'm at my wits' end ... guess it would be best to go home now 😉

datanibbler
Champion
Author

Okay,

that second issue seems solved, although I don't quite understand why - if I choose "formula on all values" instead of "sum of y values" for the figure on top of the column, I get those 6 boxes less and thus the same figure as in my textbox. So I assume this is the correct figure, though I cannot understand why a count on all values should be different from the sum of all the counts ...
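One plausible explanation - an assumption, not something confirmed here: if the count is a distinct count and a box can appear under more than one dimension value, "sum of y values" adds up the per-row counts and counts such a box several times, while "formula on all values" evaluates the expression once across all rows. Sketched with made-up names:

Count(DISTINCT Box_ID)                       // "formula on all values": evaluated once over all data
Sum(Aggr(Count(DISTINCT Box_ID), Shelf))     // roughly what "sum of y values" adds up row by row

A box sitting on two shelves is counted once by the first expression and twice by the second - which would make the smaller figure (the textbox one) the correct total.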

marcus_sommer

Hi DataNibbler,

I don't think that the many preceding LOADs are a problem. But I'm not sure about your datamodel with a (as you said) complicated key. It can be very difficult to find the reason for small differences there (the bigger ones are a lot easier to find and fix). I would rethink the whole approach in the direction of minimizing, or even avoiding, these complicated keys, and develop a star scheme instead. That means concatenating/joining/mapping the tables together as far as possible - quite often this takes less effort than developing a link-table or associated datamodel.
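For example (hypothetical names) - folding the Bin-to-area rule into the fact table with a mapping, so no separate link key is needed:

AreaMap:
MAPPING LOAD Bin_name, Storage_area INLINE [
Bin_name, Storage_area
123-04-07, Shelf 123
AREA-77, AREA-77
];

Facts:
LOAD *,
     ApplyMap('AreaMap', Bin_name, 'MISSING') AS Storage_area   // unmatched Bins get a visible value
INLINE [
Bin_name, Boxes
123-04-07, 12
ZZ-99, 3
];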

- Marcus

datanibbler
Champion
Author

Hi Marcus,

yes, I get what you mean. The whole approach is kind of funny ...

The fact that a dimension_field from one table impacts the overall sum of a figure from another table can only mean that there is some issue with the key.

The trouble is that our real company_database does not know shelves or other storage_areas, only Bins. So I faced the challenge of linking our new storage_database to the Bin_names. Now, at times the Bin_name has 3 numeric elements, like xxx-xx-xx, and the shelf_number is the first of those elements. At other times, the name of a storage_area is identical to the Bin_name. At still other times, the name of a storage_area corresponds to the first two characters of many Bin_names, or the first four ...
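In the script that currently looks roughly like this - a sketch with made-up names, and the real rules would need a proper rule table:

Storage_areas:
LOAD Storage_area INLINE [
Storage_area
AREA-77
ST
];

Bins:
LOAD Bin_name,
     If(SubStringCount(Bin_name, '-') = 2,
        SubField(Bin_name, '-', 1),            // xxx-xx-xx: shelf_number is the 1st element
     If(Exists(Storage_area, Bin_name),
        Bin_name,                              // Bin_name identical to a storage_area name
        Left(Bin_name, 2))) AS Storage_key     // otherwise fall back to a 2-character prefix
INLINE [
Bin_name
123-04-07
AREA-77
ST0042
];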

Also, we don't even have all the storage_areas in our new database yet; there is still a remainder, because many Bins that goods can be booked onto do not correspond to any physical area, but rather to the status of the goods. And at times there are several Bins that correspond to one physical area - but only one area, so the goods are not physically moved, yet still booked from one Bin onto another.

I cannot make it any easier, the system is not standardized - or there are about 15 different standards, as you prefer ...

I already use mapping_tables to avoid circular references and to avoid having to hard-code lists of Bin_names in my script's formulas - using ApplyMap() instead - and I try to reduce the number of tables by joining. I will keep on trying.

Best regards,

DataNibbler

marcus_sommer

Hi DataNibbler,

yes, it's quite difficult to handle or even correct poor processes and/or poor data quality. Probably there will always be some gaps, but it is really helpful (especially in the long term) to be able to show that gaps exist, and maybe where they are and what caused them.

That means making sure that every failure to link or associate data, and all missing data, is assigned a real value like 'NULL', 'MISSING', '#NV' and so on, maybe with an additional number added to differentiate them even more. Further, store for all data the source they come from, and maybe a status like 'ok.' and 'not ok.', and, if the amount of data isn't really huge, also RecNo() and RowNo() for each record, and then plot those data in tableboxes. That makes it a lot easier to understand where the problems come from, and also to show it to others.
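A minimal sketch of that tagging (hypothetical names) - every record gets a link status, a source tag and its record numbers, so the gaps can simply be listed in a tablebox:

Areas:
LOAD Area_key,
     Area_key AS Area_key_copy
INLINE [
Area_key
A1
];

Facts:
LOAD Area_key,
     If(Exists(Area_key_copy, Area_key), 'ok.', 'MISSING') AS LinkStatus,   // failed links get a real value
     'company_database' AS Source,                                          // where the record comes from
     RecNo() AS SourceRecNo,                                                // position in the source
     RowNo() AS LoadRowNo                                                   // position in the result table
INLINE [
Area_key
A1
B2
];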

- Marcus

datanibbler
Champion
Author

Hi Marcus,

yes, I get you. I have already done things like this elsewhere - separating the data where something is amiss and assembling meta-data about the issues for presentation is, in a way, an important part of data_analysis.

The question is, will anybody be interested in fixing this? The Bin_names in our company_database are too basic - in the sense that they are at the very basis of many processes. Still, it might be worthwhile going after this - reporting the issues and keeping a record of them helps to remove oneself from accountability for the resulting inaccuracies ...