simCluster and simCluster+

K-Means Versus K-Spilling

simCluster+ uses K-Means as its underlying clustering approach. In general, this means that simCluster+ returns K clusters and maps every element in the dataset to one of them. However, simCluster can also perform K-Spilling clustering. K-Spilling returns K clusters that are more tightly formed around their "mean" centroids; a data point that is not close enough to any of those centroids "spills" into a secondary cluster. As a result, K-Spilling returns K+N clusters: the first K are tightly bound, and the N additional clusters hold the data points that fall outside the density criteria.
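The difference between the two modes can be illustrated with a small, self-contained Python sketch. This is not simCluster's implementation or API: the function names `kmeans` and `k_spilling`, the `max_dist` distance threshold standing in for the "density criteria", the `n_spill` parameter that fixes N, and the choice to seed centroids from the first K points are all assumptions made for illustration.

```python
import math


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, iters=100):
    """Plain Lloyd-style K-Means. Assumption: the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # New centroid is the coordinate-wise mean of the cluster members.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def k_spilling(points, k, max_dist, n_spill=1):
    """Hypothetical K-Spilling: points farther than max_dist from their
    centroid are moved into up to n_spill secondary clusters (labels k, k+1, ...)."""
    centroids, labels = kmeans(points, k)
    spilled = [
        i for i, p in enumerate(points)
        if math.dist(p, centroids[labels[i]]) > max_dist
    ]
    if spilled:
        n = min(n_spill, len(spilled))
        _, spill_labels = kmeans([points[i] for i in spilled], n)
        for i, s in zip(spilled, spill_labels):
            labels[i] = k + s
    return labels


# Two tight groups plus one distant point.
points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10), (5, 30)]
labels = k_spilling(points, k=2, max_dist=8.0)
print(labels)  # [0, 1, 0, 0, 1, 1, 2]
```

With a very large `max_dist` nothing spills and the result is the ordinary K-Means assignment, in which the distant point is forced into one of the two primary clusters. This sketch does not recompute the primary centroids after spilling; doing so would be one way to tighten the K primary clusters further.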