Hello,
I wonder if there is any way to calculate some basic statistics based on aggregated data.
I have the data set as below (full data set in attachment).
In addition to the simple average, I would like to calculate the median, the decile distribution, the standard deviation, and the kurtosis. It would also be great to calculate a more robust average that ignores outlying observations (perhaps a winsorized mean).
DAYS AFTER DELIVER | SALE COUNT |
0 | 591 224 |
1 | 2 519 994 |
2 | 5 580 213 |
3 | 9 939 836 |
4 | 12 666 755 |
5 | 12 152 089 |
6 | 12 812 327 |
7 | 10 684 022 |
8 | 8 324 704 |
9 | 6 060 070 |
10 | 4 516 300 |
11 | 4 076 047 |
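To make the goal concrete, here is a minimal Python sketch (not Qlik script) of the count-weighted mean, standard deviation, and kurtosis for the twelve rows above. The names `rows` and `weighted_moments` are made up for illustration, and the sketch assumes population (not sample-adjusted) moments and excess kurtosis, so a tool using a different estimator may return slightly different figures.

```python
import math

# (DAYS AFTER DELIVER, SALE COUNT) pairs copied from the excerpt above
rows = [
    (0, 591_224), (1, 2_519_994), (2, 5_580_213), (3, 9_939_836),
    (4, 12_666_755), (5, 12_152_089), (6, 12_812_327), (7, 10_684_022),
    (8, 8_324_704), (9, 6_060_070), (10, 4_516_300), (11, 4_076_047),
]

def weighted_moments(rows):
    """Return (mean, population std dev, excess kurtosis), weighting each value by its count."""
    total = sum(count for _, count in rows)
    mean = sum(value * count for value, count in rows) / total
    # Second and fourth central moments, again weighted by count
    m2 = sum(count * (value - mean) ** 2 for value, count in rows) / total
    m4 = sum(count * (value - mean) ** 4 for value, count in rows) / total
    return mean, math.sqrt(m2), m4 / m2 ** 2 - 3

mean, std_dev, excess_kurtosis = weighted_moments(rows)
print(f"mean={mean:.4f} std_dev={std_dev:.4f} excess_kurtosis={excess_kurtosis:.4f}")
```

These numbers describe only the twelve buckets shown here, not the full attached data set.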
Qlik has some statistical functions you might want to check.
I know these functions, but they do not work properly on aggregated data. For example, on the whole data set I attached in my previous post, Median([DAYS AFTER DELIVER]) returns 613 days, while the correct median is around 8 days. That is because the median does not take into account the sale count in each 'DAYS AFTER DELIVER' bucket.
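To illustrate the difference, here is a small Python sketch (again not Qlik script) that compares the plain median of the bucket values with a count-weighted median, and then derives deciles and a winsorized mean in the same way. The helper names `weighted_quantile` and `winsorized_mean` are hypothetical, and the quantile rule used (smallest value whose cumulative count reaches the requested share, with no interpolation) is only one of several common definitions. It runs on the twelve-row excerpt only, so it cannot reproduce the 613 vs. 8 figures from the full data set.

```python
from bisect import bisect_left
from itertools import accumulate
from statistics import median

# (DAYS AFTER DELIVER, SALE COUNT) pairs copied from the excerpt in the first post
rows = [
    (0, 591_224), (1, 2_519_994), (2, 5_580_213), (3, 9_939_836),
    (4, 12_666_755), (5, 12_152_089), (6, 12_812_327), (7, 10_684_022),
    (8, 8_324_704), (9, 6_060_070), (10, 4_516_300), (11, 4_076_047),
]

def weighted_quantile(rows, q):
    """Smallest value whose cumulative count reaches the share q of all observations."""
    ordered = sorted(rows)
    cumulative = list(accumulate(count for _, count in ordered))
    target = q * cumulative[-1]
    # bisect_left finds the first bucket whose running total is >= target
    return ordered[bisect_left(cumulative, target)][0]

def winsorized_mean(rows, p=0.1):
    """Count-weighted mean after clamping values to the p and 1 - p weighted quantiles."""
    low, high = weighted_quantile(rows, p), weighted_quantile(rows, 1 - p)
    total = sum(count for _, count in rows)
    return sum(min(max(value, low), high) * count for value, count in rows) / total

unweighted_median = median(value for value, _ in rows)  # ignores SALE COUNT entirely
weighted_median = weighted_quantile(rows, 0.5)
deciles = [weighted_quantile(rows, k / 10) for k in range(1, 10)]

print("unweighted median:", unweighted_median)
print("weighted median:", weighted_median)
print("deciles:", deciles)
print("10% winsorized mean:", round(winsorized_mean(rows), 4))
```

On this excerpt the two medians happen to be close because the buckets are consecutive days. On the full data set, where there are many sparsely populated buckets with large day values, the unweighted figure drifts far from the weighted one.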
Probably the best place to search would be the Design Blog area. Here is one post I found on averages that may help with that particular piece:
https://community.qlik.com/t5/Qlik-Design-Blog/Average-Which-average/ba-p/1466654
Here is another one regarding integration with other third-party tools:
Here is the base URL to the area in case you wish to further search on your own:
https://community.qlik.com/t5/Qlik-Design-Blog/bg-p/qlik-design-blog
Regards,
Brett