Thanks for the reply.
Let me know if you can help with this.
OK, let's say I wanted to show 1,000 points out of my 11,000 so that I do not end up with a density plot.
What would I have to do to limit it to 1,000 points?
In my data set, over 99% of the data is linked to "cluster 1". Let's say I want 90% of my 1,000 points to be a random sample of the data linked to cluster 1, and the rest to come from the remaining clusters.
Any idea what function I would write?
Hm, I'm not sure this is the best approach, but here's a try; I think this should work.
It's not necessarily random, unless you consider the id field (I am using the field "data_point") random.
I am OR'ing two data sets in the set analysis (the + between the two set modifiers is a union):
1) limiting to cluster = 1 for the top 90% (ranked by data_point)
2) cluster -= 1 (everything except cluster 1)
But you can build on the expressions below to make them more suitable; I realize your scenario was a specific example.
x-measure:
sum({<data_point={"=rank(max({<cluster={1}>} data_point)) < count(total {<cluster={1}>} data_point) * .9" }> + <cluster -= {1} >} y1)
y-measure:
sum({<data_point={"=rank(max({<cluster={1}>} data_point)) < count(total {<cluster={1}>} data_point) * .9" }> + <cluster -= {1} >} x1)
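If the sampling can be done before the data reaches the chart, the 90/10 split from the question can be sketched outside Qlik. The snippet below is Python, not Qlik syntax, and is only a rough illustration: the function name stratified_sample, the row layout (dicts with "data_point" and "cluster" keys), and the made-up 10,900 / 100 split are all assumptions, not something from the actual data set. It draws a truly random 900 points from cluster 1 and up to 100 from the other clusters.

```python
import random


def stratified_sample(rows, total=1000, share_cluster_1=0.9, seed=None):
    """Hypothetical helper: pick `total` rows, `share_cluster_1` of them from cluster 1.

    `rows` is assumed to be a list of dicts with a "cluster" key.
    """
    rng = random.Random(seed)  # pass a seed for a reproducible sample
    in_cluster_1 = [r for r in rows if r["cluster"] == 1]
    others = [r for r in rows if r["cluster"] != 1]

    # Cap each part at what is available, since the other clusters are small.
    n_cluster_1 = min(round(total * share_cluster_1), len(in_cluster_1))
    n_others = min(total - n_cluster_1, len(others))

    # random.sample draws without replacement, so no point appears twice.
    return rng.sample(in_cluster_1, n_cluster_1) + rng.sample(others, n_others)


if __name__ == "__main__":
    # Made-up data: 11,000 points, 10,900 of them (over 99%) in cluster 1.
    rows = [
        {"data_point": i, "cluster": 1 if i < 10900 else 2 + i % 3}
        for i in range(11000)
    ]
    sample = stratified_sample(rows, seed=42)
    print(len(sample), sum(1 for r in sample if r["cluster"] == 1))
```

The sampled data_point ids could then be loaded as a flag field and used in the set analysis in place of the rank-based condition, though that part is untested here.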