[Statlist] Next Talk: Friday, May 21, 2010 with Christian Hennig, University College London
Cecilia Rey
rey at stat.math.ethz.ch
Mon May 17 09:06:11 CEST 2010
ETH and University of Zurich
Proff. P. Buehlmann - L. Held -
H.R. Kuensch - M. Maathuis - S. van de Geer
*********************************************************
We are glad to announce the following talk
*Friday, May 21, 2010 15.15 - 17.00 HG G 19.1 *
***********************************************************
with * Christian Hennig*, University College London
/Title: /
How to merge normal mixture components for cluster analysis
/Abstract: /
Normal mixture models are often used for cluster analysis. Usually,
every component of the mixture is interpreted as a cluster. This,
however, is often not appropriate. A mixture of two normal components
can be unimodal and quite homogeneous. Particularly, mixtures of several
normals can be needed to approximate homogeneous non-normal distributions.
Even if there are non-normal subpopulations in the data, the normal
mixture model is still a good tool for clustering because of its
flexibility. This presentation is about methods to decide whether, after
having fitted a normal mixture, several mixture components should be
merged in order to be interpreted as a single cluster.
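The claim that a two-component normal mixture can be unimodal can be checked numerically. The following is a minimal sketch, not taken from the talk: the helper names `mixture_density` and `count_modes`, the grid size, and the example parameters are all assumptions chosen for illustration. It counts the local maxima of a univariate mixture density on a fine grid.

```python
import numpy as np

def mixture_density(x, weights, means, sds):
    """Density of a univariate normal mixture evaluated at the points x."""
    x = np.asarray(x, dtype=float)[:, None]          # shape (n, 1)
    w = np.asarray(weights, dtype=float)             # shape (k,)
    m = np.asarray(means, dtype=float)
    s = np.asarray(sds, dtype=float)
    comp = np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
    return comp @ w                                  # weighted sum over components

def count_modes(weights, means, sds, grid_size=20001):
    """Count local maxima of the mixture density on a grid (illustrative only)."""
    lo = min(means) - 5 * max(sds)
    hi = max(means) + 5 * max(sds)
    x = np.linspace(lo, hi, grid_size)
    f = mixture_density(x, weights, means, sds)
    slope = np.sign(np.diff(f))
    slope = slope[slope != 0]                        # ignore exact ties
    # A mode is a change from increasing to decreasing density.
    return int(np.sum((slope[:-1] > 0) & (slope[1:] < 0)))

# Two equally weighted unit-variance components, means 1.5 apart: one mode.
close = count_modes([0.5, 0.5], [0.0, 1.5], [1.0, 1.0])
# Same components with means 4 apart: two modes.
far = count_modes([0.5, 0.5], [0.0, 4.0], [1.0, 1.0])
print(close, far)
```

In the first case the fitted model has two components, yet the density has a single peak, so treating each component as its own cluster would split what looks like one homogeneous group.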
Note that this cannot be formulated as a statistical estimation problem,
because the likelihood and the general fitting quality of the model do
not depend on whether single mixture components or sets of mixture
components are interpreted as clusters. So any method depends on a
specification of what the user wants to regard as a "cluster". There are
at least two different cluster concepts, namely identifying clusters
with modes (and therefore merging unimodal mixtures) and identifying
clusters with clear patterns in the data (which, for example, means
that scale mixtures, though unimodal, should
not necessarily be merged). Furthermore, it has to be specified how
strong a separation is required between different clusters.
The methods proposed and compared in this presentation are all
hierarchical. From an estimated mixture, pairs of components (and later
pairs of already merged mixtures) are merged until the members of every
remaining pair are separated enough to be interpreted as different clusters. This
can be measured in many different ways, depending on the underlying
cluster concept.
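The hierarchical scheme described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the speaker's implementation: it works in one dimension, uses the Bhattacharyya distance between normals as the separation measure, summarises each merged cluster by a single moment-matched normal rather than keeping the full mixture, and takes an arbitrary user-chosen `threshold`. The function names are hypothetical.

```python
import math

def bhattacharyya(c1, c2):
    """Bhattacharyya distance between two univariate normals.

    Each component is a tuple (weight, mean, variance); weights are ignored here.
    """
    _, m1, v1 = c1
    _, m2, v2 = c2
    v = 0.5 * (v1 + v2)
    return 0.125 * (m1 - m2) ** 2 / v + 0.5 * math.log(v / math.sqrt(v1 * v2))

def moment_match(c1, c2):
    """Summarise the mixture of two components by one normal (an approximation)."""
    w1, m1, v1 = c1
    w2, m2, v2 = c2
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    v = (w1 * (v1 + (m1 - m) ** 2) + w2 * (v2 + (m2 - m) ** 2)) / w
    return (w, m, v)

def merge_components(components, threshold):
    """Merge the closest pair repeatedly until every pair is at least
    `threshold` apart. Returns the original component indices per cluster."""
    clusters = [([i], c) for i, c in enumerate(components)]
    while len(clusters) > 1:
        # Find the least separated pair of current clusters.
        dist, i, j = min(
            (bhattacharyya(clusters[a][1], clusters[b][1]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist >= threshold:
            break  # all remaining pairs count as different clusters
        members = clusters[i][0] + clusters[j][0]
        clusters[i] = (members, moment_match(clusters[i][1], clusters[j][1]))
        del clusters[j]
    return [members for members, _ in clusters]

# Hypothetical fitted mixture: (weight, mean, variance) per component.
fitted = [(0.3, 0.0, 1.0), (0.3, 1.0, 1.0), (0.4, 10.0, 1.0)]
result = merge_components(fitted, threshold=0.5)
print(result)
```

Swapping `bhattacharyya` for another separation measure, or changing `threshold`, changes which components end up together, which reflects the point that the outcome depends on the chosen cluster concept and the required degree of separation.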
Beyond the methodology itself, the talk will also cover some implications
for how to think about cluster analysis problems in general.
This abstract can also be found at the following link:
http://stat.ethz.ch/talks/research_seminar
--
ETH Zürich
Seminar für Statistik
Cecilia Rey-Lutz, HG G10.3
Rämistrasse 101
CH-8092 Zurich
mail: rey at stat.math.ethz.ch
phone: +41 44 632 3438/fax: +41 44 632 1228