Tag: clustering


06 August 2019 / Algorithms / Data Mining
In parts #1 and #2 of the “Outliers Detection in PySpark” series, I talked about Anomaly Detection, Outliers Detection and the interquartile range (boxplot) method. In this third and last part, I will talk about how one can use the popular K-means clustering algorithm to detect outliers.

K-means

K-means is one of the easiest and most popular unsupervised algorithms in Machine Learning for Clustering.
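As a rough illustration of the idea, here is a minimal pure-Python sketch (not the PySpark code from this series) that clusters a handful of 2-D points with K-means and flags any point that lies far from its nearest centroid. The sample points, the choice of `k=2`, the distance threshold of `3.0`, and the helper names `kmeans` and `find_outliers` are all assumptions made up for this example.

```python
import math

def kmeans(points, k, iterations=20):
    """Basic Lloyd's algorithm. The first k points seed the centroids,
    which is an arbitrary but deterministic choice for this sketch."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids

def find_outliers(points, centroids, threshold):
    """Return the points whose distance to the nearest centroid exceeds threshold."""
    return [p for p in points if min(math.dist(p, c) for c in centroids) > threshold]

# Two tight groups plus one point that sits far away from both (hypothetical data).
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.9, 1.1),
          (8.1, 7.9), (7.9, 8.2), (4.5, 15.0)]

centroids = kmeans(points, k=2)
outliers = find_outliers(points, centroids, threshold=3.0)
print(outliers)
```

In this toy data set the far-away point still gets assigned to a cluster and pulls that centroid towards itself, which is why the threshold has to be chosen with some care; in practice it is often derived from the distribution of distances rather than fixed by hand.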