[Statlist] Seminar on Statistics
Christina Kuenzli
kuenzli at stat.math.ethz.ch
Wed Apr 12 10:23:05 CEST 2006
ETH and University of Zurich
Professors
A.D. Barbour - P. Buehlmann - F. Hampel
H.R. Kuensch - S. van de Geer
***********************************************************
We are pleased to announce the following talks
***********************************************************
April 20, Thursday, 16.15 h, LEO C 15
Introduction to the modern Minimum Description Length Principle
Peter Grünwald, CWI, Amsterdam/EURANDOM, Eindhoven
The Minimum Description Length (MDL) Principle is an
information-theoretic method for statistical inference, in particular
model selection. In recent years, particularly since 1995,
researchers have made significant theoretical advances concerning MDL.
In this talk we aim to present these results to a wider audience. In
its modern guise, MDL is based on the information-theoretic concept of
a `universal model'. We explain this concept at length. We show that
previous versions of MDL (based on so-called two-part codes), Bayesian
model selection and predictive validation (a form of cross-validation)
can all be interpreted as approximations to model selection based on
`universal models'. In a model selection context, MDL prescribes the
use of a minimax optimal universal model, the so-called `normalized
maximum likelihood model' or `Shtarkov distribution'. We also discuss
nonparametric forms of MDL and their asymptotic behaviour in terms of
convergence rate measured in Kullback-Leibler and/or Hellinger risk.
We present a theorem of Barron which directly connects 'good'
universal models with good rates of convergence.
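For readers less familiar with the normalized maximum likelihood model mentioned above, the display below sketches its standard textbook definition; it is not part of the talk abstract, and the notation (a parametric model M = {p_theta : theta in Theta} over length-n sequences x^n from a discrete sample space) is ours.

  \[
    p_{\mathrm{NML}}(x^n)
      = \frac{\sup_{\theta \in \Theta} p_\theta(x^n)}
             {\sum_{y^n} \sup_{\theta \in \Theta} p_\theta(y^n)},
    \qquad
    \mathrm{COMP}_n(M)
      = \log \sum_{y^n} \sup_{\theta \in \Theta} p_\theta(y^n).
  \]

MDL model selection then amounts to minimizing the codelength
-log p_NML(x^n) = -log sup_theta p_theta(x^n) + COMP_n(M) over the
candidate models, a fit term plus a complexity term; the minimax
optimality referred to above is optimality of this code in terms of
worst-case regret over all sequences x^n.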
***********************************************************
April 21, Friday, 15.15 h, LEO C 15
Inconsistency of Bayes and MDL under Misspecification
Peter Grünwald, CWI, Amsterdam/EURANDOM, Eindhoven
We show that Bayesian and MDL inference can be statistically
inconsistent under misspecification: for any a > 0, there exists a
distribution P, a set of distributions (model) M, and a 'reasonable'
prior on M such that
(a) P is not in M (the model is wrong), and
(b) there is a distribution P' in M with KL-divergence D(P,P') = a;
yet, if data are i.i.d. according to P, then the Bayesian posterior
concentrates on an (ever-changing) set of distributions that all have
KL-divergence to P much larger than a. If the posterior is used for
classification purposes, it can even perform worse than random
guessing.
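For completeness, and not part of the abstract itself: D(P,P') above is
the Kullback-Leibler divergence, which in its standard form (with p and
p' densities of P and P', our notation) reads

  \[
    D(P \,\|\, P') = \mathbb{E}_{X \sim P}\!\left[\log \frac{p(X)}{p'(X)}\right] \;\ge\; 0 .
  \]

Condition (b) thus says that M contains a distribution within
KL-divergence a of the true P, while the posterior nevertheless
concentrates on distributions whose divergence from P is much larger.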
The result is fundamentally different from existing Bayesian
inconsistency results due to Diaconis, Freedman and Barron, in that we
can choose the model M to be only countably large; if M were
well-specified (`true'), then by Doob's theorem this would immediately
imply consistency.
Joint work with John Langford of the Toyota Technological Institute,
Chicago.
***********************************************************
________________________________________________________
Christina Kuenzli <kuenzli at stat.math.ethz.ch>
Seminar fuer Statistik
Leonhardstr. 27, LEO D11        phone: +41 (0)44 632 3438
ETH-Zentrum                     fax:   +41 (0)44 632 1228
CH-8092 Zurich, Switzerland     http://stat.ethz.ch/~