aj_954
Contributor III

Re: Trouble plotting a scatter plot using an indicator

Thanks for the reply.

Let me know if you can help with this.

OK, let's say I wanted to show 1,000 points out of my 11,000 so that I do not end up with a density plot. What would I have to do to limit it to 1,000 points?

In my data set, over 99% of the data is linked to "cluster 1". Let's say I want 90% of my 1,000 points to be a random sample of the cluster 1 data, and the rest to come from the remaining clusters.

Any idea what function I would write?


Accepted Solutions
stevejoyce
Specialist II

Hm, I'm not sure this is the best approach, but here's a try; I think it should work.

It's not really random, unless you consider the id field (I am using the field "data_point") to be random.

I am combining two sets in the set analysis with the union operator (+):

1) data points in cluster 1, limited to the top 90% when ranked by data_point

2) everything outside cluster 1 (cluster -= {1})
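
To make the selection logic concrete, here is a rough Python sketch (not Qlik syntax) of which points the two combined sets pick out. The function name `select_points` and the `(data_point, cluster)` tuple layout are assumptions made up for illustration; the rank rule mirrors a "rank below 90% of the cluster 1 count" condition, with rank 1 going to the highest data_point.

```python
def select_points(rows, main_cluster=1, keep_fraction=0.9):
    """Return the data_point ids kept by the two-set union.

    rows: iterable of (data_point, cluster) tuples (hypothetical layout).
    """
    rows = list(rows)

    # Set 1: rank the main cluster's ids, highest id first (rank 1).
    main_ids = sorted((dp for dp, c in rows if c == main_cluster), reverse=True)
    threshold = len(main_ids) * keep_fraction
    # Keep only ids whose rank is strictly below the threshold.
    kept_main = {dp for rank, dp in enumerate(main_ids, start=1) if rank < threshold}

    # Set 2: every id that is not in the main cluster.
    others = {dp for dp, c in rows if c != main_cluster}

    # Union of the two sets, the counterpart of "+" between set modifiers.
    return kept_main | others


if __name__ == "__main__":
    # Ten points in cluster 1 (ids 1..10) and two points in cluster 2.
    demo = [(i, 1) for i in range(1, 11)] + [(11, 2), (12, 2)]
    print(sorted(select_points(demo)))
```

Because the cut is made on the id rank, the same points are dropped every time, which is why the result is only as random as the ids themselves.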

You can build on the expressions below to make them more suitable; I realize your scenario was a specific example.

x-measure:

sum({<data_point={"=rank(max({<cluster={1}>} data_point)) < count(total {<cluster={1}>} data_point) * .9" }> + <cluster -= {1} >} x1)

y-measure:

sum({<data_point={"=rank(max({<cluster={1}>} data_point)) < count(total {<cluster={1}>} data_point) * .9" }> + <cluster -= {1} >} y1)
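
If a genuinely random 90/10 split of 1,000 points is needed, the idea can be sketched outside Qlik as follows. This is only an illustration in Python under stated assumptions: the function name `stratified_sample`, its parameters, and the `(data_point, cluster)` tuple layout are hypothetical, and the sketch does not show how to wire the result back into a Qlik app.

```python
import random


def stratified_sample(rows, total=1000, main_cluster=1, main_share=0.9, seed=None):
    """Draw `total` ids: about `main_share` of them at random from the main
    cluster and the remainder at random from all other clusters.

    rows: iterable of (data_point, cluster) tuples (hypothetical layout).
    """
    rng = random.Random(seed)  # a seed makes the draw reproducible
    rows = list(rows)
    main = [dp for dp, c in rows if c == main_cluster]
    rest = [dp for dp, c in rows if c != main_cluster]

    # Never ask for more points than a group actually has.
    n_main = min(round(total * main_share), len(main))
    n_rest = min(total - n_main, len(rest))

    # random.sample draws without replacement.
    return rng.sample(main, n_main) + rng.sample(rest, n_rest)


if __name__ == "__main__":
    # Roughly the situation described: ~99% of 11,000 points in cluster 1.
    data = [(i, 1) for i in range(10890)] + [(10890 + i, 2) for i in range(110)]
    picked = stratified_sample(data, seed=42)
    print(len(picked))
```

Unlike the rank-based cut, this picks a different subset of cluster 1 for each seed, so the plotted points do not depend on how the ids happen to be ordered.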
