Solved: Re: Trouble plotting a scatter plot using an indic... - Qlik Community

aj_954 · 2021-09-30

Thanks for the reply.

Let me know if you can help with this.

OK, let's say I wanted to show 1000 points out of my 11,000, so that I do not end up with a density plot.

what would I have to do to limit it to 1000 points?

In my data set, over 99% of the data is linked to "cluster 1". Let's say I only wanted to take a random sample of the data linked to cluster 1: I want 90% of my 1000 points to be from cluster 1 (random sample) and the rest to be from the remaining clusters.

Any idea on the function that I would write?

stevejoyce · 2021-09-30

Hm, I'm not sure this is the best approach, but here's a try. I think this should work.

It's not necessarily random, unless you consider the id field (I am using the field "data_point") random.

I am OR'ing 2 data sets in the set analysis:

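1) cluster = 1, limited to the data_point values that rank in the top 90% of that cluster

2) cluster -= 1, i.e. everything outside cluster 1

Note that the cutoff is 90% of the cluster 1 points, not 90% of 1000. Your scenario was a specific example, so you can build on the expressions below to make them more suitable.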

x-measure:

sum({<data_point={"=rank(max({<cluster={1}>} data_point)) < count(total {<cluster={1}>} data_point) * .9" }> + <cluster -= {1} >} y1)

y-measure:

sum({<data_point={"=rank(max({<cluster={1}>} data_point)) < count(total {<cluster={1}>} data_point) * .9" }> + <cluster -= {1} >} x1)
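For comparison, here is a minimal sketch of the truly random split described in the question (1000 points, 90% from cluster 1, the rest from the other clusters). It is plain Python rather than Qlik, and only illustrates the sampling logic, for example as a pre-processing step before the data is loaded. The function name `stratified_sample`, the dict layout, and the demo data are assumptions made for illustration; only the field names data_point and cluster come from the thread.

```python
import random


def stratified_sample(points, total=1000, main_cluster=1, main_share=0.9, seed=0):
    """Pick `total` points at random, `main_share` of them from `main_cluster`.

    `points` is assumed to be a list of dicts with 'data_point' and
    'cluster' keys (hypothetical layout, not taken from the app).
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    main = [p for p in points if p["cluster"] == main_cluster]
    rest = [p for p in points if p["cluster"] != main_cluster]

    # Cap each quota at what is actually available in that group.
    n_main = min(round(total * main_share), len(main))
    n_rest = min(total - n_main, len(rest))

    # random.sample draws without replacement, so no point is picked twice.
    return rng.sample(main, n_main) + rng.sample(rest, n_rest)


# Made-up demo data: 11,000 points, a little over 99% of them in cluster 1.
points = [{"data_point": i, "cluster": 1} for i in range(10_895)]
points += [{"data_point": 10_895 + i, "cluster": 2 + i % 3} for i in range(105)]

sample = stratified_sample(points)
sampled_ids = sorted(p["data_point"] for p in sample)
```

One possible way to use the result is to load the sampled ids back into the app as a flag field and filter on that flag in set analysis, instead of relying on rank. That part is untested here.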
