Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi !
I have some problems with the "set analysis" functions.
I have a list of "sells" having a date and a value.
[CODE]
Sells:
LOAD * INLINE [
Date, Value
2008-01-01, 10
2008-02-01, 20
2008-03-01, 30
2008-04-01, 40
2008-05-01, 50
2008-06-01, 60
2008-07-01, 70
2008-08-01, 80
2008-09-01, 90
2008-10-01, 100
2008-11-01, 110
2008-12-01, 120
2009-01-01, 130
2009-02-01, 140
2009-03-01, 150
2009-04-01, 160
];
[/CODE]
I would like, for each line, to have 2 more values :
- the sum from the start of the year
- the sum from the 4 past months
Expected values :
2008-01-01, 10 --> 10 | 10
2008-02-01, 20 --> 30 | 30
2008-03-01, 30 --> 60 | 60
2008-04-01, 40 --> 100 | 100
2008-05-01, 50 --> 150 | 140
2008-06-01, 60 --> 210 | 180
2008-07-01, 70 --> 280 | 220
2008-08-01, 80 --> 360 | 260
2008-09-01, 90 --> 450 | 300
2008-10-01, 100 --> 550 | 340
2008-11-01, 110 --> 660 | 380
2008-12-01, 120 --> 780 |420
2009-01-01, 130 --> 130 | 460
2009-02-01, 140 --> 270 | 500
2009-03-01, 150 --> 420 | 540
2009-04-01, 160 --> 580 | 580
At first, I wanted to add dynamically this values while loading, to be able to play with this values as axis.
Here is my first problem : I don't know how to use the "SUM" function in the "LOAD" section.
Is it at least possible ?
(I can use the "peek" function, but it not helps for what I want to do...)
Next, I tryied to calculate those values "on the fly".
I've created a simple table (graph), in wich I set "Date" as dimension, and "=Value" as expression.
I want a to add another expression that says "the sum of Values in the lines having the same year" and onther one that says "the sum of Values from the 4 past months".
I added a field "Year" in my load :
[CODE]
Sells2:
LOAD *,
left(Date, 4) AS [Year]
RESIDENT Sells
ORDER BY Date;
[/CODE]
If I add
=sum({$<Year<={"2008"}>} Value)
=sum({$<Year<={THE_YEAR_CORRESPONDING_TO_THE_YEAR_OF_THIS_LINE}>} Value)
Unfortunately, there's no such thing as THE_YEAR_CORRESPONDING_TO_THE_YEAR_OF_THIS_LINE in set analysis. Set analysis is performed ONCE for the entire chart. You can't define a separate set of data for each line of the chart, which is what you're trying to do here.
Here are two approaches. The first approach does the accumulation in the script. I'm not happy with the four month accumulation, since it assumes you only have one record per month. The YTD logic is more robust. Both fields required that the records be in ascending order by date. The second approach does the accumulation in the chart. It uses a month completely separated from your main data set as the dimension, and then uses IF statements to sum the correct months for each row of the chart.
Got the Sum(Value) per year while loading
Not the most elegant script, but it works
The other part is more work. Perhaps I can set aside some time tomorrow.
Got the Sum(Value) per year while loading
This gives me the sum for the whole year.
The value I want is "the sum from the begining of the year to the current date".
IE :
For the January 2008 month, I want "values of Jan 2008".
For the February 2008 month, I want "values of Jan 2008 + values of Feb 2008".
For the March 2008 month, I want "values of Jan 2008 + values of Feb 2008 + values of Mar 2008".
...
For the Decembre 2008 month, I want "values of Jan + Feb + Mar + Apr + May + Jun + Jul + Aug + Sep + Oct + Nov + Dec 2008".
Perhaps I can set aside some time tomorrow.
Thank you for your help !
Unfortunately, there's no such thing as THE_YEAR_CORRESPONDING_TO_THE_YEAR_OF_THIS_LINE in set analysis. Set analysis is performed ONCE for the entire chart. You can't define a separate set of data for each line of the chart, which is what you're trying to do here.
Here are two approaches. The first approach does the accumulation in the script. I'm not happy with the four month accumulation, since it assumes you only have one record per month. The YTD logic is more robust. Both fields required that the records be in ascending order by date. The second approach does the accumulation in the chart. It uses a month completely separated from your main data set as the dimension, and then uses IF statements to sum the correct months for each row of the chart.
Thank you for this answer.
It shows me exactly what I wanted to do.
I hadn't though to use a "IslandMonth" table to have a separated list of UNIQUE months.
Thank you very much !
I should probably point out some weaknesses of the separate month approach. First, performance. QlikView is essentially looping through the months, then pulling in ALL months of data to compare to each. That's a lot of work for it to be doing, and on large data sets, the report will be noticeably slow. It may perform just fine on smaller data sets. Second, ease of use issues. If you select the month from the report, you aren't selecting the real month, only the fake month. So while that report will change, no other reports will change. Even worse, if you select the real month, the chart will still show multiple months - every month of data that the one month of data will be accumulated into.
I think I have a third approach that handles it in the data model without actually doing accumulations ahead of time. I'll need to work it into the example, but it would be loosely based on the attached approach for showing a previous year's data. The basic idea is that every record has a unique ID, and then you create a table that joins one selected month to the actual records that span multiple months. If I can't get to it quickly, maybe you can work it out with that hint and the example file if the weaknesses of the separate month approach matter in your application.
Here's the original example extended to use the linkage table approach. Using FOR loops is probably the most efficient from a memory point of view, but there's probably a much faster approach using JOIN. I've replaced FOR loops in the past with JOINS because time to load has been more critical around here than memory use. But that's really a separate issue, so I'm leaving it as is since it appears to work.
Thank you for this alternative method.
It seems to be really efficient