Cluster Validation for Mixtures of Regressions via the Total Sum of Squares Decomposition

Journal of Classification, Volume OnlineFirst – Jul 16, 2019

Publisher
Springer Journals
Copyright
Copyright © 2019 by The Classification Society
Subject
Statistics; Statistical Theory and Methods; Pattern Recognition; Bioinformatics; Signal, Image and Speech Processing; Psychometrics; Marketing
ISSN
0176-4268
eISSN
1432-1343
DOI
10.1007/s00357-019-09326-4

Abstract

One of the challenges in cluster analysis is the evaluation of the obtained clustering results without using auxiliary information. To this end, a common approach is to use internal validity criteria. For mixtures of linear regressions whose parameters are estimated by maximum likelihood, we propose a three-term decomposition of the total sum of squares as a starting point to define some internal validity criteria. In particular, three types of mixtures of regressions are considered: with fixed covariates, with concomitant variables, and with random covariates. A ternary diagram is also suggested for easier joint interpretation of the three terms of the proposed decomposition. Furthermore, local and overall coefficients of determination are respectively defined to judge how well the model fits the data group-by-group but also taken as a whole. Artificial data are considered to find out more about the proposed decomposition, including violations of the model assumptions. Finally, an application to real data illustrates the use and the usefulness of these proposals.

Keywords: Cluster validation · EM algorithm · Maximum likelihood · Mixtures of regressions · Model-based clustering · Ternary diagram

1 Introduction

The decomposition of the total sum of squares (total variation) in the explained sum
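The general idea of splitting a total sum of squares into three parts can be illustrated with a minimal sketch. It is not the decomposition proposed in the paper, whose exact form is not given in the abstract: the sketch assumes a hard partition of the observations and a separate ordinary least squares fit with intercept in each group, whereas the paper works with mixtures estimated by maximum likelihood. The function name `decompose_tss`, the labels `bss`, `ewss`, `rwss` (between-group, within-group explained, within-group residual), and the simulated data are all hypothetical choices made here for illustration.

```python
import numpy as np


def decompose_tss(x, y, labels):
    """Split the total sum of squares of y into three terms.

    Assumes hard group labels and a per-group OLS fit with intercept
    (an illustrative simplification, not the paper's ML-based method).
    Returns (tss, bss, ewss, rwss, local_r2).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    labels = np.asarray(labels)

    grand_mean = y.mean()
    tss = np.sum((y - grand_mean) ** 2)

    bss = ewss = rwss = 0.0
    local_r2 = {}
    for g in np.unique(labels):
        mask = labels == g
        xg, yg = x[mask], y[mask]
        # Design matrix with an intercept column for this group.
        design = np.column_stack([np.ones(len(yg)), xg])
        beta, *_ = np.linalg.lstsq(design, yg, rcond=None)
        fitted = design @ beta
        group_mean = yg.mean()

        # Between-group part: group mean versus grand mean.
        bss += len(yg) * (group_mean - grand_mean) ** 2
        # Within-group part explained by the local regression.
        explained = np.sum((fitted - group_mean) ** 2)
        # Within-group part left in the residuals.
        residual = np.sum((yg - fitted) ** 2)

        ewss += explained
        rwss += residual
        # Local coefficient of determination for this group.
        local_r2[g] = explained / (explained + residual)

    return tss, bss, ewss, rwss, local_r2


# Simulated two-group data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=200)
labels = np.repeat([0, 1], 100)
y = np.where(labels == 0, 1.0 + 2.0 * x, 20.0 - 1.5 * x)
y = y + rng.normal(0.0, 1.0, size=200)

tss, bss, ewss, rwss, local_r2 = decompose_tss(x, y, labels)

# Shares of the three terms; they sum to one, which is the kind of
# triple that could be placed on a ternary diagram.
shares = np.array([bss, ewss, rwss]) / tss
print("shares (between, explained, residual):", shares)
print("local R^2 per group:", local_r2)
```

Under these assumptions the three terms add up to the total sum of squares exactly (up to floating-point error), because the within-group sum of squares of each group splits into an explained and a residual part whenever the local fit includes an intercept.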

Published: Jul 16, 2019

There are no references for this article.