Clustering

Last revised by Candace Makeda Moore on 9 May 2024

Citation:

Moore C, Murphy A, Weerakkody Y, Clustering. Reference article, Radiopaedia.org (Accessed on 17 May 2024) https://doi.org/10.53347/rID-70687

DOI:

https://doi.org/10.53347/rID-70687

Permalink:

https://radiopaedia.org/articles/70687

rID:

70687

Article created:

31 Aug 2019, Candace Makeda Moore

Disclosures:

At the time the article was created Candace Makeda Moore had no recorded disclosures.

Last revised:

9 May 2024, Candace Makeda Moore

Disclosures:

At the time the article was last revised Candace Makeda Moore had no financial relationships to ineligible companies to disclose.

Revisions:

4 times, by 3 contributors

Sections:

Artificial Intelligence

Tags:

machine learning

Clustering, also known as cluster analysis, is a machine learning technique designed to group similar data points together. Since the data points do not necessarily have to be labeled, clustering is an example of unsupervised learning. Clustering in machine learning should not be confused with discovering clusters in epidemiology.

Many algorithms have been developed for clustering, and the effectiveness of each depends largely on the size of the dataset and the distribution of the data points. Commonly used groups of clustering algorithms in radiology, which have been applied to segmentation for decades, include fuzzy C-means clustering and K-means clustering ¹,². K-means is one of the most popular clustering algorithms; it seeks to partition a dataset into K clusters, where K is chosen in advance. An example of a more advanced algorithm is Density-Based Spatial Clustering of Applications with Noise (DBSCAN), which is more effective for data distributed in a non-Gaussian manner, for example clusters of irregular shape.
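As an illustration of how K-means groups points, the following is a minimal sketch of the standard iterative procedure (often called Lloyd's algorithm), written in plain Python. The function name `kmeans`, the toy 2D points and the choice to seed the centroids with the first K points are assumptions made for this example only; practical implementations usually use randomised or smarter initialisation.

```python
import math


def kmeans(points, k, n_iter=100):
    """Minimal K-means sketch (Lloyd's algorithm).

    Assumption for determinism: the first k points seed the centroids.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        converged = new_labels == labels and new_centroids == centroids
        labels, centroids = new_labels, new_centroids
        if converged:
            break
    return labels, centroids


# Invented toy data: two well-separated groups of 2D points.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Note that K must be supplied by the user, and that the result can depend on the initial centroids, which is one reason other algorithms may suit irregularly distributed data better.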

In radiology (as well as pathology), clustering groups data by similarity across various attributes or features, without being given the final labels to group by. The data may be sets of pixels or voxels within images, whole images, reports or patients. Clustering therefore has the potential to reveal similarities in data that humans have overlooked.
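To make the idea of grouping pixels by a feature concrete, here is a small sketch that clusters the pixels of a toy image using intensity as the only feature and returns a label map of the same shape. The image values are invented for illustration and are not real imaging data; the function name `segment_by_intensity` and the evenly spaced starting centres are likewise assumptions of this example.

```python
def nearest_centre(value, centres):
    """Index of the centre closest to a pixel intensity."""
    return min(range(len(centres)), key=lambda j: abs(value - centres[j]))


def segment_by_intensity(image, k, n_iter=50):
    """Cluster pixels by intensity alone (1D K-means) and return a label map."""
    pixels = [v for row in image for v in row]
    lo, hi = min(pixels), max(pixels)
    # Assumption: start with centres evenly spaced over the intensity range.
    centres = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    for _ in range(n_iter):
        labels = [nearest_centre(v, centres) for v in pixels]
        new_centres = []
        for j in range(k):
            members = [v for v, lab in zip(pixels, labels) if lab == j]
            # An empty cluster keeps its previous centre.
            new_centres.append(sum(members) / len(members) if members else centres[j])
        if new_centres == centres:
            break
        centres = new_centres
    # Final assignment with the converged centres, reshaped to the image grid.
    labels = [nearest_centre(v, centres) for v in pixels]
    width = len(image[0])
    label_map = [labels[i:i + width] for i in range(0, len(labels), width)]
    return label_map, centres


# Invented 3x4 "image" with dark, mid-grey and bright regions.
image = [
    [10, 12, 11, 200],
    [9, 100, 105, 210],
    [11, 98, 102, 205],
]
label_map, centres = segment_by_intensity(image, k=3)
for row in label_map:
    print(row)
```

No labels such as tissue names are supplied; the three groups emerge only from similarity in intensity, and naming them would remain a separate, human step.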

In practice, clustering has proven useful in segmentation algorithms for radiology, which are used to identify different tissue types and/or to differentiate pathological from normal tissue. Clustering algorithms are also being researched in other areas, such as natural language processing of reports ³.
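Fuzzy C-means, one of the segmentation algorithms named earlier, differs from K-means in that each data point receives a degree of membership in every cluster rather than a single hard label. The sketch below applies the standard membership and centre update formulas to a handful of invented 1D intensity values. The function names, the starting centres and the fuzzifier value `m = 2` are assumptions chosen for illustration, not settings taken from the cited literature.

```python
def compute_memberships(values, centres, m, eps=1e-12):
    """Membership of each value in each cluster; every row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        # eps avoids division by zero when a value coincides with a centre.
        dists = [abs(x - c) + eps for c in centres]
        rows.append([
            1.0 / sum((d_j / d_k) ** power for d_k in dists)
            for d_j in dists
        ])
    return rows


def fuzzy_c_means(values, centres, m=2.0, n_iter=100, tol=1e-6):
    """Minimal fuzzy C-means sketch for 1D data."""
    centres = list(centres)
    for _ in range(n_iter):
        u = compute_memberships(values, centres, m)
        # Each centre is a mean of the values weighted by membership ** m.
        new_centres = [
            sum(row[j] ** m * x for row, x in zip(u, values))
            / sum(row[j] ** m for row in u)
            for j in range(len(centres))
        ]
        shift = max(abs(a - b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:
            break
    return compute_memberships(values, centres, m), centres


# Invented intensities: a dark group, a bright group and one value in between.
values = [10, 12, 11, 55, 100, 102, 98]
memberships, centres = fuzzy_c_means(values, centres=[0.0, 120.0])
for x, row in zip(values, memberships):
    print(x, [round(w, 3) for w in row])
```

In this toy run the intermediate value is shared between both clusters instead of being forced into one, which is the property that can make soft clustering attractive where tissue boundaries are gradual.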