
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Calculating ratio from two data sets
I have two tables with sample data:
Table1: 8 million records
Product Number
YearMonth
Revenue
Product Type
Table2: 600,000 records
Product Number
YearMonth
Warranty Cost
Warranty Code
Calculation Required: Warranty Cost / Revenue ratio for Product by Month/Year (in graph) with Product Type/Warranty Code as user selections.
Issues:
- The Chart calculation is slow - 15 second chart refresh on new Blade Server with 8 cores and 64Gb RAM
- There are Product Number entries in Table 2 that don't exist in Table 1 - result of system migration and updates.
- More Product and Warranty attributes to be added to above
- Potentially more records to both tables
At present, the solution utilises a Calendar Island whereby the ratio calculation is:
=sum(if(Table2.YearMonth = Calendar.YearMonth, [Warranty Cost])) / sum(if(Table1.YearMonth = Calendar.YearMonth, [Revenue]))
Questions:
1) Is the use of the Calendar Island best practice?
2) Are there ideas out there to increase efficiency of calculation, therefore users' experience of QV?

.png)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
8 Million rows should be no issue for a server of that spec so you should be able to sort this with not too much fuss.
The best way is to replace those 'IF' functions with Set Analysis - have alook in the help file as it will explain it far better than I can - Set Analysis is a much more efficient way of running calculations when looking at larger datasets (amoungst other uses); I've had examples where using IF statements charts were taking 3mins+ but with SA were instant.
IF functions require the whole dataset to be checked as to whether they meet the condition and as such are very slow when dealing with larger volumes of data.
As for you bad data; I'd look to use a Where clause in your load script in conjunction with the exists() function to only load associated data.
Hope that helps,
Matt - Visual Analytics Ltd

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
As suggested, I have played around with the SetAnalysis and utilised the following to calculate Annual Sales in a pivot table:
sum({$ <Cal_YYYY = Trans_YYYY>} Sales)
The expression returns correct values where CAL_YYYY is being filtered, otherwise, it returns blank when no selection is made on CAL_YYYY.
Have I missed something?


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
In your case, the best solution would be to avoid Date Island, if possible, and to link the calendar directly, in order to leverage QlikView associative logic.
To fix your Set Analysis expression above, you can use something like this:
sum({$ <Cal_YYYY = P(Trans_YYYY)>} Sales)
Another problem is the fact that you need fields from 2 tables for a single calculation. QlikView needs to match all possible values from one table with all possible values from the other table, prior to calculating the ratio.
As irrational as it sounds, I believe that loading both tables into one (concatenating) could improve performance, because the two parts of the formula would now come from the same table, therefore eliminating the need in resolving the link (or two?) in the run-time. By the way, concatenating those tables into one can also help you eliminate the Date Island, because now you can name the dates with the same Field Name.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Taking on board the suggestion of concatenating into one table, the fields would look as follows:
Product Year Sales WarrantyCost WarrantyCause
XXX 2000 100
XXX 2001 200
XXX 2000 20 AAA
XXX 2000 40 BBB
If AAA was selected from WarrantyCause, WarrantyCost would be 20 for Product/Year Combination. What is the expression to return the Sales of 100 for that Product/Year Combination to calculate WarrantyCost/Sales ratio of 20%?
In addition, if both AAA and BBB was selected, I only need to return 100 for Sales, therefore the WarrantyCost/Sales ratio should be 60%


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
well, this is a bit more complex, and in this case, concatenation might not be the best solution...
Or, if it's still the "least of all evils", you might be able to circumvent the problem by a Set Analysis formula that would disregard the selection of Warranty Cause for the sum(Sales) - something like this:
sum(WarrantyCost) / sum( {<WarrantyCause=>} Sales)
cheers,

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
If I was to revert to using 2 tables without a calendar island - 1 for sales and the other for Warranty Cost, such that:
Table 1:
Product
Year
Sales
Table 2
Product
Year
Warranty Cost
Warranty Cause
The link between the 2 tables would be Product (can't use Year as there are multiple definitions of Year).
How would I be able to relate the Year fields above, without including them in a join? The purpose is to calculate the WarrantyCost / Sales ratio for each Year, with the ability to filter by WarrantyCause.
Thanks for your assistance so far.


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I'm not sure what do you mean by "multiple definitions of Year"...
If you need to compare warranty costs to Sales per year, you need to link by Product and Year. This can be done in 2 ways:
1. Let QlikView generate a Synthetic Key that will combine Product and Year together. I'm not a big fan of those, but there is an opinion that it's OK to have a simple synthetic key for a couple of fields.
2. You can generate a composite key by yourself and create what's called a LinkTable (look it up in the forum, there are many descriptions of how to do it). In the case, you control the process of creating the composite key, instead of QlikView.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Multiple definitions: In the warranty fact table, there exist date definitions for Date(Accident) and Date(ClaimMade) and others....
Therefore creating one link with Date dimenision between Sales and Warranty fact tables would inhibit different calculations of dates, limiting effectiveness of Calendar Islands, link tables etc.
In essence, if the same calculation was to be done in Access for example, one would use nested queries where the first query would return WarrantCost by (Year,Month) based on selection of filters, then using this set of ProductNumbers to interrogate the Sales Fact table to return the sales for those Products by (Year/Month). I am just trying to replicate this nested query within QV.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Just reflecting, the Calendar Island option would the answer with the exception that my fact tables are huge (8+m rows), therefore diminishes user experience with snappiness of QV. Have gone down this path and this solution is not acceptable.
Any other ideas?
