Identifying the Unknown With Clustering Metrics

Sep 7, 2022, Software Pundits

This post was originally published on this site

Toptal

Clustering is an unsupervised machine learning method to divide given data into groups based solely on the features of each sample. Sorting data into clusters can help identify unknown similarities between samples or reveal outliers in the data set. In the real world, clustering has significance across diverse fields from marketing to biology: Clustering applications include market segmentation, social network analysis, and diagnostic medical imaging.

Because this process is unsupervised, multiple clustering results can form around different features. For example, imagine you have a data set composed of various images of red trousers, black trousers, red shirts, and black shirts. One algorithm might find clusters based on clothing shape, while another might create groups based on color.

When analyzing a data set, we need a way to accurately measure the performance of different clustering algorithms; we may want to contrast the solutions of two algorithms, or see how close a

To read the full article click on the 'post' link at the top.

Date

Sep 7, 2022

Welcome

Written By Toptal

You May Also Like

More Posts From: Software Pundits