Hi Qlik aficionados,
I am currently trying to implement the Iris Cluster example in Qlikview.
Basically, it consists of an input dataset with the following fields: Observation, Sepal Length, Sepal Width, Petal Length, Petal Width, Iris Species.
I need to call the R function kmeans in order to group irises into 3 clusters, based on the information about petal and sepal dimensions.
I created a table chart, using Observation as dimension, and
R.ScriptEval('kmeans(cbind(q$petLen, q$petWid, q$sepLen, q$sepWid), 3, nstart = 20)$cluster', petLen, petWid, sepLen, sepWid)
as the expression. This works wonderfully!
Now, I want to create a scatter chart, assigning a color to each observation, according to the cluster it belongs to.
So, I set Observation as dimension, Petal Length as the first expression (x-axis) and Petal Width as the second expression (y-axis). Then, in the background color properties of the x-axis expression, I set the following rule:
if (R.ScriptEval('kmeans(cbind(q$petLen, q$petWid, q$sepLen, q$sepWid), 3, nstart = 20)$cluster', petLen, petWid, sepLen, sepWid)=1, rgb(200, 12, 45))
For simplicity, here I avoided nested IFs. For the moment, the idea is to set the color for the Observations belonging to cluster 1.
Doing this, I get the error message "Allocated memory exceeded" on the scatter plot, while SSEtoRserve says "more cluster centers than distinct data points".
My explanation of this is that Qlikview is passing the data to R one record at a time. Therefore, R can't calculate 3 clusters out of a single line of data. This is also confirmed by the fact that if I set the clusters variable to 1, it works.
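To illustrate this hypothesis outside QlikView, here is a minimal R sketch. It assumes that R's built-in iris data frame stands in for the fields QlikView passes as q, and that a single-row matrix mimics what R would receive if records arrived one at a time. It is not the actual SSEtoRserve call, only a reproduction of the R side.

```r
# Assumption: the built-in iris data stands in for the QlikView fields.
set.seed(42)
x <- as.matrix(iris[, c("Petal.Length", "Petal.Width",
                        "Sepal.Length", "Sepal.Width")])

# Whole dataset (150 rows): three centers can be found.
fit <- kmeans(x, 3, nstart = 20)
print(table(fit$cluster))

# One record only: a 1-row matrix has a single distinct point,
# so asking for 3 centers raises an error. Capture its message.
one_row <- x[1, , drop = FALSE]
err <- tryCatch(
  kmeans(one_row, 3, nstart = 20),
  error = function(e) conditionMessage(e)
)
print(err)

# With a single center, the same one-row call goes through.
single <- kmeans(one_row, 1, nstart = 20)
print(single$cluster)
```

If the one-row call produces the same "more cluster centers than distinct data points" message as SSEtoRserve, that would support the idea that the color expression is being evaluated per record rather than over the whole column.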
So, I think that the correct question is: How can I pass my variables to R as a whole? I mean, just like passing the entire variable array, and not the points one by one.
And also, why is the same function working correctly with the table chart?
Sorry for the length of the post, but I hope this is clear. I am rather new to QlikView, and I am aware that I am very likely missing some important and basic detail about the way QlikView uses data.
Thanks!
Hi,
Attached is a sample with a workaround for your problem.
I encountered the same issue and will test it later.
By the way, why do you use QlikView and not Qlik Sense?
Thank you very much Liron!
This is exactly what I was looking for!! However, I guess I have to learn more about scatter plots now.. I don't understand what the expression "size" is doing. If I change the number in its definition, I can see the colors on the plot changing position between clusters, while if I remove it, the dot sizes of the clusters are affected. Also, this expression looks different from the others, as its display options are different, though not changeable.
Regarding your question, I am doing this exercise because I want to implement cluster analysis in a pre-existing QlikView project.
The "size" expression sets the bubble size. You can use a third expression for this, or a static number as I did, so that all symbols have the same size.