Because of its deterministic nature, K-means does not yield confidence information about centroids and estimated cluster memberships, although such information could be useful for inferential purposes. In this paper we propose to obtain it by means of a non-parametric bootstrap procedure, the performance of which is tested in an extensive simulation study. Results show that the coverage of hyper-ellipsoid bootstrap confidence regions for the centroids is in general close to the nominal coverage probability. For the cluster memberships, we found that probabilistic membership information derived from the bootstrap analysis can be used to improve the cluster assignment of individual objects, albeit only in the case of a very large number of clusters. With smaller numbers of clusters, however, the probabilistic membership information still appeared to be useful, as it indicates for which objects the cluster assignment resulting from the analysis of the original data is likely to be correct; hence, this information can be used to construct a partial clustering in which only those objects are assigned to clusters.
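The general idea of a non-parametric bootstrap around K-means can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`kmeans`, `match_labels`, `bootstrap_kmeans`), the brute-force label matching over permutations, the way membership proportions are tallied, and the 0.9 threshold for the partial clustering are all hypothetical choices made for the example. The sketch collects bootstrap centroids but does not construct the hyper-ellipsoid confidence regions themselves; one plausible route would be to build them from the mean and covariance of the matched bootstrap centroids.

```python
import random
from itertools import permutations

def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(pts)
    return tuple(sum(col) / n for col in zip(*pts))

def assign(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda j: sqdist(p, centroids[j]))

def kmeans(points, k, rng, n_starts=5, n_iter=50):
    """Lloyd's algorithm with several random starts; keeps the lowest-SSE run."""
    best, best_sse = None, float("inf")
    for _ in range(n_starts):
        centroids = rng.sample(points, k)
        for _ in range(n_iter):
            clusters = [[] for _ in range(k)]
            for p in points:
                clusters[assign(p, centroids)].append(p)
            # An empty cluster keeps its previous centroid.
            new = [mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
            if new == centroids:
                break
            centroids = new
        sse = sum(min(sqdist(p, c) for c in centroids) for p in points)
        if sse < best_sse:
            best, best_sse = centroids, sse
    return best

def match_labels(ref, boot):
    """Reorder bootstrap centroids to line up with the reference centroids.
    Brute force over permutations (assumption: k is small)."""
    k = len(ref)
    perm = min(permutations(range(k)),
               key=lambda q: sum(sqdist(ref[j], boot[q[j]]) for j in range(k)))
    return [boot[perm[j]] for j in range(k)]

def bootstrap_kmeans(points, k, n_boot=200, seed=0):
    """Return reference centroids, matched bootstrap centroids, and
    per-object membership proportions across bootstrap solutions."""
    rng = random.Random(seed)
    ref = kmeans(points, k, rng)
    counts = [[0] * k for _ in points]
    boot_centroids = []
    for _ in range(n_boot):
        sample = [rng.choice(points) for _ in points]  # resample with replacement
        cents = match_labels(ref, kmeans(sample, k, rng))
        boot_centroids.append(cents)
        for i, p in enumerate(points):
            counts[i][assign(p, cents)] += 1
    probs = [[c / n_boot for c in row] for row in counts]
    return ref, boot_centroids, probs

# Toy data: two well-separated 2-D Gaussian clusters (illustrative only).
gen = random.Random(1)
data = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(30)] + \
       [(gen.gauss(10, 1), gen.gauss(10, 1)) for _ in range(30)]

ref, boot, probs = bootstrap_kmeans(data, k=2, n_boot=50)

# Partial clustering: assign an object only if its top proportion clears a
# (hypothetical) threshold; otherwise leave it unassigned (None).
THRESHOLD = 0.9
partial = [row.index(max(row)) if max(row) >= THRESHOLD else None for row in probs]
```

Each row of `probs` holds the proportion of bootstrap solutions in which an object falls nearest to each matched centroid, and `partial` leaves an object unassigned when no cluster reaches the threshold. For larger numbers of clusters, the permutation search in `match_labels` would need to be replaced by a proper assignment algorithm.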
Journal of Classification – Springer Journals
Published: Jul 8, 2015