Anonymous
Not applicable

Use an R function in Script

Hi,

Is it possible to evaluate a vector with R.Evaluate() while loading data in the script?

Let's say I have the Iris dataset and I want to create a column in the script that tells me which cluster each record belongs to.

Thanks in advance.

Paolo

8 Replies
Gysbert_Wassenaar

Yes, that's possible.


talk is cheap, supply exceeds demand
Anonymous
Not applicable
Author

Hi Gysbert,

Thanks for your answer.

Here is the script I'm using:

[Iris]:

LOAD [observation],

[sepal length],

[sepal width],

[petal length],

[petal width],

[iris species]

FROM [lib://data/Iris.csv]

(txt, codepage is 28591, embedded labels, delimiter is ',', msq);

rename table Iris to tmp#Iris;

NoConcatenate

Iris:

load

[observation],

[sepal length],

[sepal width],

[petal length],

[petal width],

[iris species],

R.ScriptEval('kmeans(cbind(q$petLen, q$petWid, q$sepLen, q$sepWid), 3, nstart = 20)$cluster',

[petal length] as petLen,

[petal width] as petWid,

[sepal length] as sepLen,

[sepal width] as sepWid) as kclust

resident tmp#Iris;

drop table tmp#Iris;

This is the error message I receive:

Unexpected token: ',', expected one of: 'AutoGenerate', 'From', 'From_Field', 'Inline', 'Resident', 'Where', 'While', ...

Is there a particular syntax required to use R.ScriptEval?

qhe
Employee

I am facing the same problem as Paolo.

R file, CSV, QVF and Log files are attached. Hope someone can assist us further.

The script runs OK in the R environment:

rm(list=ls())

q<-iris

q[,6]<-kmeans(cbind(q$Petal.Length, q$Petal.Width, q$Sepal.Length, q$Sepal.Width), 3, nstart = 20)$cluster

q[,6]

table(q[,6])

However, after running the script below in QSE, I get the error "more cluster centers than distinct data points".

I tried removing the header with "names(q) <- NULL", but received the same error message.

Qlik script:

Table1:

LOAD

    Sepal.Length,

    Sepal.Width,

    Petal.Length,

    Petal.Width,

    Species,

    R.ScriptEvalStr('q[,6]<-kmeans(cbind(q$Petal.Length, q$Petal.Width, q$Sepal.Length, q$Sepal.Width), 3, nstart = 20)$cluster;

                        q[,6];',

                      Sepal.Length,

                      Sepal.Width,

                      Petal.Length,

                      Petal.Width

                    ) as Kmeans

FROM [lib://Data for Qlik Sense Demo (qvsrv_qlik)/risi.csv]

(txt, codepage is 28591, embedded labels, delimiter is ',', msq);

Cheers

Qiyu

Anonymous
Not applicable
Author

Hi,

I think the problem lies in the fact that during the load script you're only processing one item at a time, while kmeans clustering needs all the data points at once to determine the clusters. It has to be called once for all data points, and it then returns a cluster index for each data point.

So you can run the functions in scripts, but not all R functions are suitable.
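
A minimal sketch of why the row-at-a-time call fails, written in pure Python rather than R so it runs without any packages. The `kmeans` helper below is a hypothetical toy implementation (not R's `kmeans()` and not what Qlik does internally), the sample rows are made-up petal measurements, and the error text is copied from the message quoted earlier in the thread.

```python
# Toy k-means for illustration only; names and data are assumptions.

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Cluster `points` (a list of tuples) into k groups; return 1-based labels."""
    distinct = sorted(set(points))
    if len(distinct) < k:
        # With fewer distinct points than clusters there is nothing to fit.
        raise ValueError("more cluster centers than distinct data points")
    # Deterministic start: pick centers spread across the sorted distinct points.
    step = (len(distinct) - 1) / (k - 1) if k > 1 else 0
    centers = [distinct[round(i * step)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return [lab + 1 for lab in labels]


def try_per_row(rows, k):
    """Mimic a row-at-a-time call: every call sees exactly one data point."""
    errors = []
    for row in rows:
        try:
            kmeans([row], k)
        except ValueError as exc:
            errors.append(str(exc))
    return errors


# Made-up (petal length, petal width) values, not the real Iris data.
rows = [(1.4, 0.2), (1.5, 0.2), (1.3, 0.3),
        (4.5, 1.5), (4.7, 1.4), (4.4, 1.3),
        (6.0, 2.5), (5.9, 2.1), (6.1, 2.3)]

if __name__ == "__main__":
    print(kmeans(rows, 3))          # one call with the whole table works
    print(try_per_row(rows, 3)[0])  # one call per row always raises the error
```

Called once with all nine rows, the function returns one label per row. Called once per row, every call fails, because a single point can never be split into three clusters.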

Hope this helps.

Regards,

Bas

Anonymous
Not applicable
Author

Hi,

do you have any example of R functions in script?

Thanks

Paolo

Steven_Pressland
Employee

To see a working example of the Iris dataset, have a look at the Advanced Analytics Expression Builder, which will write the code for you. There is a YouTube video that uses this dataset as an example:

http://branch.qlik.com/#!/project/596f87f186a5cf7ec72e90e9

For these types of examples, the integration is intended to do the processing in chart expressions. Script usage is limited at present, with the intention of adding more functionality in the future.

Anonymous
Not applicable
Author

Hi Steven,

Thanks for your answer.

My point here is not strictly limited to the Iris dataset.

In order to create real business cases, I think being able to run scripts such as cluster analysis is essential. Usually you want to cluster when you need insights from data much more complex than Iris.

If you have to perform it on a data set with 1,000,000 records, the chart takes a long time to calculate, and users lose Qlik's experience of dynamic navigation.

As I understand it, using R functions (except sum()) in the script is not possible, as Qiyu also confirmed.

If someone has a script, even a simple one, that they are willing to share, it would be very helpful.

Cheers

Paolo

michal
Partner - Contributor III

I totally agree with Paolo.

The examples run live on the data sent to R.

It would be good to be able to run R from script on data sets also. Clustering is a good example but apriori analysis is an even better one - it takes too long to calculate dynamically.

I am also looking for a workaround, like storing data to CSV and running R or Python in the background (it would load data from that CSV and export results to another CSV, which would be read by Qlik again).
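
A rough sketch of what the Python side of such a round trip could look like, using only the standard library. Everything here is an assumption for illustration: the function names, the output column name `kclust` (borrowed from the script earlier in the thread), the `utf-8-sig` input encoding, and especially `placeholder_model`, which is a trivial stand-in for a real clustering call and not an actual k-means.

```python
import csv


def run_batch(in_path, out_path, key_field, feature_fields, model):
    """Read the CSV stored by Qlik, score ALL rows in one call, write key + result."""
    # utf-8-sig tolerates a leading byte-order mark (assumed export encoding).
    with open(in_path, newline="", encoding="utf-8-sig") as src:
        rows = list(csv.DictReader(src))

    # Build one feature tuple per row from the chosen numeric columns.
    features = [tuple(float(r[f]) for f in feature_fields) for r in rows]

    # Single call on the whole table, unlike a row-by-row expression.
    labels = model(features)
    if len(labels) != len(rows):
        raise ValueError("model must return one label per input row")

    # Write only the key and the result, so Qlik can join it back on the key.
    with open(out_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow([key_field, "kclust"])
        for r, label in zip(rows, labels):
            writer.writerow([r[key_field], label])
    return len(rows)


def placeholder_model(features):
    """Stand-in for a real clustering routine: bins rows by their first feature."""
    return [1 if f[0] < 2.5 else 2 if f[0] < 5.0 else 3 for f in features]


if __name__ == "__main__":
    # Hypothetical file names; the field names follow the script in this thread.
    run_batch("iris_export.csv", "iris_clusters.csv",
              "observation", ["petal length", "petal width"],
              placeholder_model)
```

The result file holds only the key and the cluster label, so on the Qlik side it can be loaded and joined back to the original table on [observation]. Swapping `placeholder_model` for a real clustering function leaves the rest unchanged, as long as it takes the full list of rows and returns one label per row.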

An interesting solution can be provided with the RapidMiner tool. It can be called from Qlik with a simple Web Services request.

This way we call it from within the script and receive the result directly in the script: (1) store the CSV, (2) call RapidMiner via Web Services, which reads the CSV and does the calculation, (3) receive the result back.