Not applicable

Differing Answers with and without SUM

Hi,

Although an expert with Excel, I am relatively new to the Qlikview community - but loving the power and speed of the application. However, I am having some strange results that I do not understand . . . and they are throwing the dashboard way off course.

I am selecting several records and based on the number and the volume of a product, I can calculate the volume of product at a particular location - and there could be several products at the same location. But the results are not as I expected - and potentially point to me not understanding the use of SUM exactly.

If I use the following formula: Sum([Stock_Ledger_Detail QTYS]*[Product_Details VOLUME]) I get the incorrect answers for the product volume:

[screenshot: pivot table with the incorrect SUM results - image not available]

I exported the above data to Excel and manually multiplied the Stock Ledger Quantities by the Product Volume to show the right answers:

| Location_Details Data Source | Site_Details SITE_CODE | Location_Details LOC_STAT | Location_Details LOC_CODE | Stock_Ledger_Detail PROD_CODE | Stock_Ledger_Detail QTYS | Product_Details VOLUME | Available Volume | Product Volume | Correct Answer | Factor Bigger |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GACware (Unite) | B2A | Mixed | VF0122 | 0273988 | 28 | 0.006084 | 1.1000 | 0.3407 | 0.170352 | 2 |
| GACware (Unite) | B2A | Mixed | VF0122 | 0273988 | 30 | 0.006084 | 1.1000 | 0.7301 | 0.18252 | 4 |
| GACware (Unite) | B2A | Mixed | VF0122 | 0273988 | 40 | 0.006084 | 1.1000 | 0.4867 | 0.24336 | 2 |
| GACware (Unite) | B2A | Mixed | Total |  |  |  | 1.1000 | 1.5575 | 0.596232 | 2.612245 |


Using SUM, the answers calculated by QlikView are a factor of either 2 or 4 too large.



If I use the same as above, but drop the SUM portion, then the product volumes are correct, but I lose the sub-totals:

[screenshot: the same table without SUM, showing correct per-row product volumes - image not available]

Can someone tell me why SUM behaves like this? And yet the Available volume calculation, also using SUM, is smart enough to know that there is a single location and that the volume is not the sum of the three records, but is 1.1CBM?

I am lost and need some enlightenment . . . please.

Thanks
Ian

30 Replies
Not applicable
Author

Nope, not yet . . . I think that I will need to re-look at how I am linking everything together . . . clearly something is not right, as my answers even on simple datasets are out by factors of 2x and 4x.

I know there is a straight forward explanation somewhere . . . just wish someone would enlighten me 🙂

Not applicable
Author

The suggested change to the linking key removed the synthetic table . . . however, the volumes are still out by a factor of 2x, and in some cases 4x. I am baffled and desperately need someone to point out what must be a common mistake I have made somewhere.

Come on QV community . . . someone has seen my mistake before?

Cheers
Ian

Not applicable
Author

Hi

I did a count, and you can see that you have two Product_Details VOLUME values in every row, and in some rows you have two Stock_Ledger_Detail QTYS values.

Maybe that can help you solve the problem with the data structure.

Regards

Anders

brenner_martina
Partner - Specialist II

Hi, Ian,

why do you have more data than before? Please upload an example again!

Not applicable
Author

Hi Martina / Anders,

The reason there is more data is that, prior to reducing it, I selected three locations as opposed to only one. I did this because I wanted to give the community a little more data to look at, to try to spot some trends. I have now reduced it to a dataset that includes 4 locations.

I am not sure if I have made progress, but something that Anders pointed out may be a clue as to why this is working as it is. The company is currently in the throes of an application migration. It is not a complete application change, but rather a migration from a version of the product that was developed specifically to manage one location to a group version of the application that is used in all other locations around the world.

This means that for things like the product details, including the product code and product volume, I could be importing the exact same record from each application, as I am generally importing from each database (the Dubai and Group versions) and then merging them in a QV table. This theoretically implies I could have two identical records in terms of PRIN_CODE and VOLUME . . .

. . . and since I apply one constant when reading records from the Dubai database and a different one when reading from the Group database, this is no longer theoretical . . . it is exactly what is happening. The company cannot purge the data at the application level, as migrated customers need to have their old transactional data maintained for several years in the old system for legal reasons.

This explains the double count for volume, as these records exist in both systems - for entirely valid reasons. But so far I haven't worked out why the QTYS is counted as 2 in certain cases.
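A tiny inline example reproduces the doubling exactly. This is an illustration only - made-up loads, not my real script - but the arithmetic matches the first row of my table above:

```
// Illustration: the same product record imported from both systems.
Product_Details:
LOAD * INLINE [
PROD_CODE, VOLUME
0273988, 0.006084
0273988, 0.006084
];

// One genuine stock ledger line for that product.
Stock_Ledger_Detail:
LOAD * INLINE [
PROD_CODE, QTYS
0273988, 28
];

// A chart expression Sum(QTYS*VOLUME) now evaluates across the two
// linked combinations:
//   28 * 0.006084 + 28 * 0.006084 = 0.340704
// instead of the correct 28 * 0.006084 = 0.170352 - out by a factor of 2.
// Duplicates on both sides of the link multiply together, which would
// explain the rows that are out by a factor of 4.
```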

The migration project is going to be going on for several years, I guess . . . so I need to work out a way around these two small hurdles.

One way that I think would help is the following: I read a manually maintained Excel spreadsheet into a QV table called Principal_Master. This includes the principal code and the WMS (Warehouse Management System). If the principal has been migrated, their WMS will be shown as GACware (Unite); if not, it will be GACware (Dubai). So I know which principal is being managed by which system. However, how can I limit the import from the two databases based on the WMS value in this Principal_Master table, given that it is not available at the DB level? In other words, only import data related to GACware (Unite) if the PRIN_CODE has a corresponding WMS showing GACware (Unite)?
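In pseudo-script terms, I imagine something like the following. This is only a sketch - the file name, format specifier and SQL are guesses based on my description above - but hopefully it shows the intent:

```
// Sketch only: table, file and field names are assumptions.
// 1. Read the manually maintained principal master from Excel.
Principal_Master:
LOAD PRIN_CODE,
     WMS
FROM Principal_Master.xls (biff, embedded labels, table is [Sheet1$]);

// 2. Build a one-field list of principals managed by the Unite system.
//    Renaming the field stops it linking into the main data model.
Unite_Principals:
LOAD PRIN_CODE AS UnitePrin
Resident Principal_Master
Where WMS = 'GACware (Unite)';

// 3. When importing from the Unite database, keep only rows whose
//    PRIN_CODE already exists in that list (preceding LOAD + Exists).
Stock_Ledger_Unite:
LOAD *
Where Exists(UnitePrin, PRIN_CODE);
SQL SELECT PROD_CODE, PRIN_CODE, QTYS
FROM STOCK_LEDGER;

// 4. Drop the helper table once the imports are done.
DROP Table Unite_Principals;
```

The mirror image, with Where WMS = 'GACware (Dubai)', would then limit the Dubai import. Would something along these lines work?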

I am continuing to dig around and try and work out the best way of correctly modelling the data . . . will post an update here as soon as I discover anything else. In the meantime, any additional ideas? I have attached a reduced file so you can see the script and advise if you have any ideas.

Thanks
Ian

Not applicable
Author

Okay All QV Community Superstars,

Progress at last . . . I took our small sample and reviewed it against the physical database tables and the data held in them . . . and the current QV worksheet is spot on. What is being modelled here is a warehouse with hundreds of thousands of locations and millions of potential products. Products are put away into locations and picked on behalf of our customers.

What eventually happens, depending on which products have been picked, is that you get multiple lines in the database describing largely the same products. So when we had a line showing three (3) items left of a particular product in a particular location, it was quite feasible for there to be multiple lines of the same product but with different amounts left - which is what we had seen in the worksheet. Where COUNT was showing more than one - for example, a COUNT of 3 against a product quantity of, say, 19 - this was essentially describing THREE (3) lines in the database across which 19 items of the product remained in the location.

So the counts and the product volume seem correct, although I have now exported a large pivot table out of the database for one of the stock controllers to do some data quality control work on.

Another issue is that they are in the midst of a large customer migration, so over 40,000 locations are briefly live in both systems. This is temporary while they transfer stocks across from the old system to the new system - and the old system will be shut down as soon as the migration is completed.

Two reasonably simple questions:

1. Is there any way in a pivot table to see a count of lines for any particular column?

2. Is there any way to limit the display of data based on the results of an expression? For example, when we counted QTYS, could we have displayed only those that were greater than 1?
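For (1) I assume an extra expression using Count() would do it, and for (2) perhaps a calculated dimension - something like the sketch below, though I am guessing at the exact syntax:

```
// (1) An extra pivot-table expression showing how many underlying
//     lines sit behind each cell:
Count([Stock_Ledger_Detail QTYS])

// (2) A calculated dimension that returns a product code only when
//     more than one line exists for it; with "Suppress When Value Is
//     Null" ticked, the single-line products disappear from the chart:
=Aggr(If(Count([Stock_Ledger_Detail QTYS]) > 1,
         [Stock_Ledger_Detail PROD_CODE]),
      [Stock_Ledger_Detail PROD_CODE])
```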

So very much appreciate your continued support.

Regards
Ian

Not applicable
Author

People,

I also need to know if it is possible to do the following:

I have an expression that calculates volume utilisation by essentially taking the volume of a product in a location, dividing it by the total available volume of the location and multiplying it by 100. Effectively, percentage utilisation.

When the pivot table calculates a subtotal, how do I tell it to sum the product volume in all locations and divide by the sum of available volume of all locations? Instead of summing the records above?

From this example spreadsheet, QV correctly calculates the 4.5cbm, as it can see that the first two lines are for the same location and does not double count them. The product count and the used volume are also correctly counted, but the Utilisation, which should be 66.89 ((3.01/4.5)*100), is incorrectly calculated as the sum of the lines above, giving 200.67.

Any ideas?

Thanks
Ian

brenner_martina
Partner - Specialist II

Hi, Ian,

there is a very useful pair of functions: Dimensionality() for rows and SecondaryDimensionality() for columns. They return a number that tells you which partial-sum level the current cell is on. Then you can enter an if-statement like:

If(Dimensionality()=0, Sum(Field)/Sum(TOTAL Field), Sum(Field))

or similar. For percentages you can also set the Percentage option on the Number tab!
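Applied to the utilisation column, that could look something like this - a sketch only, with field names assumed from the thread; the Aggr() is there to avoid summing the available volume repeated on every row of a location:

```
// On the total row (Dimensionality() = 0), recompute utilisation as a
// ratio of sums per location instead of summing the row percentages.
If(Dimensionality() = 0,
   Sum([Product Volume])
     / Sum(Aggr(Avg([Available Volume]), [Location_Details LOC_CODE])) * 100,
   Sum([Product Volume]) / Sum([Available Volume]) * 100)
```

With the numbers from the example, the total row then shows (3.01 / 4.5) * 100 = 66.89 instead of 200.67.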

Not applicable
Author

I am a little confused.  Are you still having the problem with results that are either 2x or 4x what they should be when using the sum function?

What version of Qlikview are you using?

Do you use Publisher and see these incorrect results through the plugin, or from the desktop product? 

I have support working on a case due to a similar problem that I am experiencing.  We are using V9 SR6, Publisher and the plugin. 

We occasionally get 4x summed calculated values in an IF statement in a chart in our production environment when viewing the app from the accesspoint.  We get correct values when viewing the deployed app, at the same time, with the desktop product. 

When this occurs, we stop and start the server service and then the accesspoint provides the correct result.  This problem is sporadic and sometimes we go 10 days before it occurs, and sometimes we go a few days.

Please answer the above questions.

johnw
Champion III

Toni Kautto wrote:

The aim for the data model should be to have one key per table, to get simple links between tables.

Just as Martina exemplified, an easy way to resolve the synthetic key is to create a new key-field where the multiple keys are concatenated into one single value.

Brenner.Martina wrote:

at first, you have THREE synthetic tables!!!! That's wrong, QV does not like them! Combine fields in two tables to create a new key, so that the synthetic tables disappear!

(Edit:  Ack!  I didn't notice there were two pages and that I was responding to old posts.)

I haven't read the thread, so I don't know what sort of data structure Ian needs to solve his problem.  His synthetic keys may be completely inappropriate and causing problems.  He might well need to remove them.

But if so, replacing them 1:1 with manually-created composite keys is worse than useless.  Yes, I know the reference manual tells you to do this.  It's wrong.  Yes, I know your trainer taught you to do this.  Your trainer was wrong.  The synthetic key IS a composite key.  It already IS what you are replacing it with.  The only difference is that with a synthetic key, QlikView does all the hard work for you, saving you trouble, and allowing your load to go faster.  (Also, you cannot refer directly to the synthetic key table or ID - they're purely internal, but you typically don't need to refer to either, so this typically isn't a drawback.)

Why are synthetic keys considered so bad, then?  Because they often appear in bad data models, and this is usually how they are first encountered.  But it is the bad data model causing the problem, not the synthetic key.  And replacing the synthetic key 1:1 with a composite key does nothing to fix the data model, so whatever problems you have will persist.  If you have such problems, you have to fix the fundamental problem in the data model, not just replace a synthetic key with a composite key.

All the synthetic key does is link two or more tables by two or more fields.  There is nothing strictly wrong with linking two or more tables by two or more fields.  It is only a problem if that's a bad way to model your data.
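To make that concrete (illustrative inline tables, not from Ian's app): the two loads below share the fields SITE and LOC, so QlikView links them through one synthetic key over that pair - exactly the structure the manual composite key rebuilds by hand:

```
// QlikView automatically links these tables on the (SITE, LOC) pair
// via a synthetic key:
Orders:
LOAD * INLINE [
SITE, LOC, OrderQty
A, L1, 10
];

Stock:
LOAD * INLINE [
SITE, LOC, StockQty
A, L1, 5
];

// The hand-rolled "fix" builds the very same composite key yourself:
//   LOAD SITE & '|' & LOC AS %SiteLoc, OrderQty ... ;
//   LOAD SITE & '|' & LOC AS %SiteLoc, StockQty ... ;
// Same link, same behaviour, more script - it changes nothing about
// whether the underlying data model is right or wrong.
```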

There's more information in the thread below. I'm linking to the last post, where I fixed the formatting of the first post, which got badly corrupted in the move to the new forum. Basically, start there, then go back to page 1, ignore the first post (it's unreadable now), and read from there.

http://community.qlik.com/message/116838