There are many clustering algorithms, all of which group objects based on how similar (or how close, in terms of distance) their attributes are.
We will look at just one of them, K-Means clustering. You can read more about other methods of clustering here.
K-Means clustering is an unsupervised learning technique used in applications such as market segmentation, document clustering, image segmentation and image compression.
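To make the idea of grouping by distance concrete, here is a minimal sketch of the K-Means loop in plain Python. It is an illustration under stated assumptions, not the implementation you would use in practice: the function name `kmeans`, the toy points and the deterministic farthest-point start are all made up for this example (libraries usually start from random or k-means++ centroids).

```python
import math


def kmeans(points, k, n_iter=100):
    """Illustrative K-Means (Lloyd's algorithm) on a list of equal-length tuples."""
    # Assumption: deterministic farthest-point start, chosen so the example
    # is reproducible. Real libraries usually use random or k-means++ starts.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Two well-separated toy groups (made-up numbers, not the iris data).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The two steps inside the loop (assign each point to its nearest centroid, then recompute each centroid as the mean of its points) are the whole algorithm; everything else is bookkeeping.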
Usually we do K-Means clustering to:
If we think that subgroup behaviours differ substantially, we can expect more accurate models from fitting a separate model to each subgroup than from fitting one model to all groups.
This tutorial is not compulsory, but you can go through it on your own for a gentle introduction to clustering. It is easier than the clustering assignment given in Projects.
Data: Iris species
Use K-Means cluster analysis to cluster different iris species. Make an elbow plot and/or use silhouette analysis to find the optimal number of clusters.
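One possible way to approach the elbow plot and silhouette analysis is sketched below with scikit-learn and matplotlib. Several choices here are assumptions rather than part of the task: the data is loaded with `load_iris` and left unscaled, the candidate range of 2 to 8 clusters is arbitrary, and the output filename is hypothetical.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score

# Assumption: use scikit-learn's bundled copy of the iris data, unscaled.
X = load_iris().data

ks = list(range(2, 9))  # assumed candidate values of k
inertias, silhouettes = [], []
for k in ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)  # within-cluster sum of squared distances
    silhouettes.append(silhouette_score(X, model.labels_))

fig, (ax_elbow, ax_sil) = plt.subplots(1, 2, figsize=(10, 4))

# Elbow plot: look for the k after which inertia stops dropping sharply.
ax_elbow.plot(ks, inertias, marker="o")
ax_elbow.set_xlabel("Number of clusters (k)")
ax_elbow.set_ylabel("Inertia")
ax_elbow.set_title("Elbow plot")

# Silhouette analysis: a higher mean score suggests better-separated clusters.
ax_sil.plot(ks, silhouettes, marker="o")
ax_sil.set_xlabel("Number of clusters (k)")
ax_sil.set_ylabel("Mean silhouette score")
ax_sil.set_title("Silhouette analysis")

fig.tight_layout()
fig.savefig("iris_kmeans_k_selection.png")  # hypothetical filename
```

In a notebook you would typically call `plt.show()` instead of saving the figure. The two criteria do not always agree, so it is worth reading both plots before settling on a value of k.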
What are the factors that differ between different iris species?
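A simple starting point for this question is to compare the cluster centroids feature by feature: features whose centroid values are far apart are the ones the clusters differ on most. The sketch below assumes three clusters (iris has three species) and the scikit-learn copy of the data. The spread is a rough measure in raw centimetres and is not normalised by each feature's overall variance.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()
# Assumption: k=3, matching the three iris species.
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(iris.data)

# One row per cluster, one column per feature.
centers = model.cluster_centers_

# Spread of the centroid values for each feature (max minus min across clusters).
spread = centers.max(axis=0) - centers.min(axis=0)

# Print features from largest to smallest spread.
ranked = sorted(zip(iris.feature_names, spread), key=lambda pair: -pair[1])
for name, value in ranked:
    print(f"{name:20s} {value:.2f}")
```

Keep in mind that clusters are not species: K-Means never sees the species labels, so this shows which features separate the clusters, which you can then compare against the true species.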
Create a plot of the clusters.
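A cluster plot could look something like the following sketch, which colours each flower by its assigned cluster and marks the centroids. The choice of the two petal measurements as axes, the value k=3 and the output filename are assumptions made for illustration; any pair of features would work.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data
# Assumption: k=3, matching the three iris species.
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Column indices in load_iris: 2 is petal length, 3 is petal width.
PETAL_LENGTH, PETAL_WIDTH = 2, 3

fig, ax = plt.subplots(figsize=(6, 4))

# One dot per flower, coloured by the cluster it was assigned to.
ax.scatter(X[:, PETAL_LENGTH], X[:, PETAL_WIDTH], c=model.labels_, cmap="viridis", s=30)

# Mark the cluster centroids on the same axes.
ax.scatter(
    model.cluster_centers_[:, PETAL_LENGTH],
    model.cluster_centers_[:, PETAL_WIDTH],
    c="red", marker="X", s=150, label="Centroids",
)

ax.set_xlabel(iris.feature_names[PETAL_LENGTH])
ax.set_ylabel(iris.feature_names[PETAL_WIDTH])
ax.set_title("K-Means clusters on the iris data")
ax.legend()
fig.tight_layout()
fig.savefig("iris_kmeans_clusters.png")  # hypothetical filename
```

The model is fitted on all four features but only two are drawn, so points that look close in the plot may still be separated along the sepal measurements.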