Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
All - I have what I hope is an easy question. Though I've hit a wall. I have a rather simple set of data:
Document_ID | Keyword | Major | Term |
1234 | Test | Major | Side Effect |
1234 | Test | Minor | Side Effect |
4567 | Test 2 | Major | Something Else |
4567 | Test 2 | Minor | Something Else |
4567 | Test | Major | Side Effect |
1111 | Test 3 | Major | Side Effect |
2222 | Test 3 | Major | Side Effect |
3333 | Test 3 | Major | Side Effect |
4444 | Test 3 | Major | Side Effect |
5555 | Test | Major | Side Effect |
5555 | Test | Minor | Side Effect |
One document could have many keywords that are either major or minor (or have the same keyword marked both ways). The keywords are grouped into Terms.
I'm trying to create a sheet that shows a stacked bar counting the keywords by distinct document ID for only the Term "Side Effect".
If I do this as a regular bar chart with Keyword as a dimension and a measure of:
Count({$<Term={"Side Effect"}>} DISTINCT Document_ID)
This works great. I'm sorting by keyword in descending order. "Sort by expression" using the same formula as the measure.
As soon as I add another dimension (Major) and stack it I get unexpected results:
I'm using the same measure and sort formula. I would have thought that "Test 2" would be excluded since there is none with a Term of "Side Effect". The sort is also off as it's counting the distinct Document ID by Major AND Keyword now and appending them together.
I'm guessing this is a misunderstanding of set analysis on my part. Any advice or direction appreciated.
Source data and QVF attached.
Thanks,
Erik
Well I think I solved my question by sorting on expression by:
sum(aggr(Count({$<Term={'Side Effect'}>} DISTINCT Document_ID),Major,Keyword))
I'm trying to find holes in this.
maybe try this:
Count( TOTAL {$<Term={"Side Effect"}>} DISTINCT Document_ID)
Thanks Dan - unfortunately that just seems to ignore the dimensions and retrieve total data (included data other than "side effects").
Hi Erik,
uncheck this
Regards,
Antonio
And use dan.sullivan expression (with TOTAL) only in sort Keyword (sort by expression)
Thanks Antonio! That takes care of the unnecessary dimension value. Appreciate it.
But the sorting is still off. I am using the same formula to sort as the measure. But it's not sorting descending as expected. According to the help, "sort by expression" seems to be the way I want to go here:
In bar charts with multiple dimensions, sorting is locked to the first dimension. This dimension is what groups and stacks are based on, and sorting on a different dimension or a measure would break up these groups in an undesirable way. If you still want to sort by the measure value, try using the option Sort by expression on the first dimension under Sorting.
So I have
Count({$<Term={"Side Effect"}>} DISTINCT Document_ID)
as my expression for "sort by expression". This works if I have a non-stacked bar chart with keyword. But as soon as "Major" is introduced the counts change but the sorting doesn't.
Any thoughts on that piece?
Save
now can you use my sort expression formula from above?
Count(TOTAL {$<Term={"Side Effect"}>} DISTINCT Document_ID)
See Attachment.
Hi Dan - thanks. In this limited set of test data, that seems to sort it correctly (though changing it to ascending or descending does nothing - which makes me think it's not quite working as expected).
When I flip over to my large data set with real data in the same structure, that formula unfortunately does not sort correctly. I'm reading up a bit on TOTAL now....
Thanks so much Antonio - while it seems to work in this example (though I don't think it does as if you change the sort order between Ascending and Descending nothing happens) this doesn't sort my larger data set correctly.
I'm pretty stumped on this one. Appreciate your help so far.