aj_954
Contributor III

Re: Trouble plotting a scatter plot using an indicator

Thanks for the reply.

Let me know if you can help with this.

OK, let's say I wanted to show 1,000 points out of my 11,000 so that I do not end up with a density plot.

What would I have to do to limit it to 1,000 points?

In my data set, over 99% of the data is linked to "cluster 1". Let's say I only wanted to take a random sample of the data linked to cluster 1, so that 90% of my 1,000 points come from cluster 1 (random sample) and the rest come from the remaining clusters.

Any idea what function I would write?

1 Solution

Accepted Solutions
stevejoyce
Specialist II

Hm, I'm not sure this is the best approach, but here's a try. I think this should work.

It's not really random, unless you consider the ID field (I am using the field "data_point") random.

I am OR'ing two data sets in the set analysis:

1) cluster = 1, limited to the data points whose data_point rank falls in the top 90% of that cluster

2) cluster -= 1 (everything that is not in cluster 1)

You can build on the expressions below to make this more suitable; I realize your scenario was a specific example.

x-measure:

sum({<data_point={"=rank(max({<cluster={1}>} data_point)) < count(total {<cluster={1}>} data_point) * .9" }> + <cluster -= {1} >} y1)

 

y-measure:

sum({<data_point={"=rank(max({<cluster={1}>} data_point)) < count(total {<cluster={1}>} data_point) * .9" }> + <cluster -= {1} >} x1)
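
If the sample has to be truly random, one option is to draw it outside Qlik before the data is loaded. The following is only a minimal sketch in Python, not a Qlik feature: the function name `stratified_sample`, its parameters, and the assumption that each row is a dict with a `cluster` key are all hypothetical choices made for illustration.

```python
import random


def stratified_sample(rows, total=1000, main_share=0.9, main_cluster=1, seed=None):
    """Pick up to `total` rows: `main_share` of them at random from
    `main_cluster`, the remainder at random from all other clusters.

    Assumption: each row is a dict with a "cluster" key.
    """
    rng = random.Random(seed)  # pass a seed to get a repeatable sample
    main = [r for r in rows if r["cluster"] == main_cluster]
    rest = [r for r in rows if r["cluster"] != main_cluster]

    # Cap each quota at what is actually available, so a small
    # "other clusters" group simply yields fewer than `total` rows.
    n_main = min(round(total * main_share), len(main))
    n_rest = min(total - n_main, len(rest))

    return rng.sample(main, n_main) + rng.sample(rest, n_rest)
```

With 11,000 rows of which 100 or more are outside cluster 1, the defaults return 900 random cluster 1 rows plus 100 random rows from the other clusters. The data_point values of the result could then be loaded as a flag field and used in the set analysis in place of the rank condition.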
