How to Design K-means Clustering in Qlik Sense with help of Microsoft R(R Integration with Qlik Sense)

rohitk1609

This is the third document in the series on designing statistical analysis, following How to Design Multiple Linear Regression in Qlik Sense with help of Microsoft R(R Integration with Q...

The setup and assumptions are already covered in the document mentioned above.

Let's discuss the K-means algorithm.

The K-means algorithm clusters the data, discovering categories in it that are not easy to find by inspection.
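To make the idea concrete, here is a minimal sketch of the standard K-means loop (Lloyd's algorithm) in plain Python. It is only an illustration: in this article the calculation is done by R, not by this code. The function name `kmeans`, the first-k seeding and the sample (sales, quantity) pairs are all assumptions made for the example.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def kmeans(points, k, max_iter=100):
    """Cluster 2-D points with Lloyd's algorithm; returns (labels, centroids)."""
    # Assumption: seed the centroids with the first k points so the run is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # an empty cluster keeps its centroid
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical (sales, quantity) pairs: two low-volume and two high-volume products.
points = [(1.0, 1.0), (9.0, 8.0), (1.5, 2.0), (8.0, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)     # [0, 1, 0, 1]
print(centroids)  # [(1.25, 1.5), (8.5, 8.5)]
```

Each point ends up with the label of the centroid it is closest to, which is what the colored groups in a cluster chart represent.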

In a very simple example:

rohitk1609_0-1625338766918.png

 

Important points:

  1. Choose the number of clusters.
  2. Scale the data when the x and y dimensions are measured in unrelated units, say shoe size and weight. Values with different units attached (lb, tons, m, kg ...) are not directly comparable, so z-standardizing or otherwise scaling them is a best practice that gives each dimension equal weight. Scaling controls the variability of the dataset: it converts the data into a specific range using a linear transformation, which produces better-quality clusters and improves the accuracy of clustering algorithms. You don't need scaling if the data is based on longitude and latitude. If you have binary values, discrete attributes or categorical attributes, stay away from k-means: it needs to compute means, and the mean is not meaningful for this kind of data. Check out the link below to view the effects of scaling on k-means analysis.
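The scaling point can be sketched in a few lines of Python. This is a hedged illustration of z-standardization only, not the code the extension runs; the helper name `z_standardize` and the shoe-size and weight figures are made up for the example, and the population standard deviation is used here as one possible choice.

```python
from statistics import mean, pstdev


def z_standardize(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant column has no spread to rescale
    return [(v - mu) / sigma for v in values]


# Hypothetical data: shoe size (EU) and weight (kg) for four people.
shoe_size = [38, 40, 42, 44]
weight_kg = [50, 65, 80, 95]

print(z_standardize(shoe_size))
print(z_standardize(weight_kg))
```

Before scaling, weight spans 45 units while shoe size spans only 6, so weight would dominate the distance calculation. After scaling, both columns have the same spread and contribute equally.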

Let's design a K-means clustering chart in Qlik Sense, which is integrated with the R engine.

The data used to create the visual is attached.

Go to Qlik Sense => Create App => Sheet => drag and drop the Advanced Analytics extension:

 

rohitk1609_1-1625339190178.png

 

Select k-means clustering:

rohitk1609_2-1625339228190.png

 

Select Product Name as the dimension, Sales for the X axis and Quantity for the Y axis:

rohitk1609_3-1625339498382.png

 

You can increase the number of clusters by updating the setting:

rohitk1609_4-1625339563748.png
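One way to judge the effect of the cluster-number setting is to compare the total within-cluster sum of squares for several values of k. The sketch below does this in plain Python on made-up points. It is not part of the Qlik extension or of the R calculation; the names `kmeans` and `within_cluster_ss`, the first-k seeding and the data are assumptions for illustration.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm, seeded with the first k points (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Move every centroid to the mean of its members (or leave it if empty).
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
                if members else centroids[c]
            )
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def within_cluster_ss(points, labels, centroids):
    """Total squared distance from each point to its assigned centroid."""
    return sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical points forming three visible groups.
points = [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0), (1.2, 0.8), (5.2, 4.8), (9.2, 1.2)]
wcss = {k: within_cluster_ss(points, *kmeans(points, k)) for k in (1, 2, 3)}
for k, value in wcss.items():
    print(k, round(value, 2))
```

On this toy data the total falls at each step and is smallest at three clusters, matching the three groups in the points. On real data the choice is rarely this clear-cut, so treat the number as something to experiment with.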

 

 

You can scale the data if needed, as discussed above:

rohitk1609_5-1625339621846.png

 

All visuals are rendered in Qlik Sense, but the calculations are done in R.

Rohit's Introduction  

Reach out to me at kumar.rohit1609@gmail.com if you need any clarification or assistance.

Connect with me on LinkedIn  https://in.linkedin.com/pub/rohit-kumar/2b/a15/67b 

To get latest updates and articles, join my Facebook page  https://www.facebook.com/QlikIntellectuals

Last update: 2021-07-04 01:22 PM