You are currently viewing the abstract.
View Full TextLog in to view the full text
AAAS login provides access to Science for AAAS members, and access to other journals in the Science family to users who have purchased individual subscriptions.
Register for free to read this article
As a service to the community, this article is available for free. Existing users log in.
More options
Download and print this article for your personal scholarly, research, and educational use.
Buy a single issue of Science for just $15 USD.
Discerning clusters of data points
Cluster analysis is used in many disciplines to group objects according to a defined measure of distance. Numerous algorithms exist, some based on the analysis of the local density of data points, and others on predefined probability distributions. Rodriguez and Laio devised a method in which the cluster centers are recognized as local density maxima that are far away from any points of higher density. The algorithm depends only on the relative densities rather than their absolute values. The authors tested the method on a series of data sets, and its performance compared favorably to that of established techniques.
Science, this issue p. 1492
Abstract
Cluster analysis is aimed at classifying elements into categories on the basis of their similarity. Its applications range from astronomy to bioinformatics, bibliometrics, and pattern recognition. We propose an approach based on the idea that cluster centers are characterized by a higher density than their neighbors and by a relatively large distance from points with higher densities. This idea forms the basis of a clustering procedure in which the number of clusters arises intuitively, outliers are automatically spotted and excluded from the analysis, and clusters are recognized regardless of their shape and of the dimensionality of the space in which they are embedded. We demonstrate the power of the algorithm on several test cases.











