# Massive Inference

## Sibusiso Sibisi

University of Cambridge, Cavendish Laboratory

Madingley Road, England CB3 0HE

### Abstract

Bayesian estimation of a non-negative density **\phi(x)**
over continuous **x** is discussed by John Skilling in
these proceedings. The analysis uses a finite discretization on
**x**, subject to the requirement that different
discretizations must be consistent if they are to be representations
of the same continuum limit problem. This condition effectively leads
to a gamma process 'prior', which becomes a Dirichlet process if unit
normalization on **\phi** is imposed. Here we focus on
problems where such normalization holds, with density estimation as
our exemplar.

The Dirichlet process has a discrete representation: to any desired
accuracy, a sample from the process consists of a finite number of
isolated point masses even in the continuum limit. This evades the
Law of Large Numbers, which would otherwise lead to flat, zero-variance
estimates. Because of this property, we term the use of such a
process *Massive Inference*.
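
The discrete representation can be illustrated with a truncated stick-breaking draw. This is a minimal sketch rather than the method used in the paper: the stick-breaking construction, the concentration parameter `alpha`, the uniform base measure on [0, 1], and the truncation tolerance `tol` are all assumptions introduced here for illustration.

```python
import numpy as np

def sample_dirichlet_process(alpha, base_sampler, tol=1e-6, rng=None, max_atoms=100_000):
    """Draw a truncated sample from a Dirichlet process by stick-breaking.

    Returns atom locations and their masses. Atoms are added until the
    unassigned mass falls below `tol` (or `max_atoms` is reached), so the
    sample is a finite set of point masses to the requested accuracy.
    """
    rng = np.random.default_rng() if rng is None else rng
    locations, weights = [], []
    remaining = 1.0  # mass not yet assigned to any atom
    while remaining > tol and len(weights) < max_atoms:
        v = rng.beta(1.0, alpha)              # fraction of the remaining stick to break off
        weights.append(remaining * v)
        locations.append(base_sampler(rng))   # atom position drawn from the base measure
        remaining *= 1.0 - v
    return np.array(locations), np.array(weights)

# Hypothetical setup: uniform base measure on [0, 1], alpha = 5.
rng = np.random.default_rng(0)
locs, w = sample_dirichlet_process(
    alpha=5.0,
    base_sampler=lambda r: r.uniform(0.0, 1.0),
    rng=rng,
)
print(len(w), w.sum())
```

Under these assumptions the number of atoms stays modest however fine the underlying discretization of **x**, because the mass left unassigned shrinks geometrically on average.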

In density estimation, we are given a set of *iid* observations
drawn from an unknown continuous density **f**, and
we wish to estimate **f** from them. In this problem, as in many
others, **f** is presumed to be smooth. We incorporate
smoothness through the integral formulation

$$
f(x) = \int K(x, x')\, \phi(x')\, dx'
$$

By construction, **\phi** is a latent density containing
no spatial smoothness; the appropriate smoothness is delegated to
the kernel function **K**, which is specified up to
a few shape parameters. We use a Dirichlet process prior on
**\phi** so that the above integral effectively becomes a
finite discrete sum over the point masses of **\phi**.
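
When **\phi** consists of point masses w_j at positions x_j, the integral reduces to a sum of w_j K(x, x_j) over the atoms. The sketch below evaluates such a sum. It is illustrative only: the Gaussian form of the kernel, its single shape parameter `width`, and the three atoms are assumptions, not values from the paper.

```python
import numpy as np

def gaussian_kernel(x, centre, width):
    """Normalized Gaussian kernel; `width` is its single shape parameter."""
    z = (x - centre) / width
    return np.exp(-0.5 * z**2) / (width * np.sqrt(2.0 * np.pi))

def mixture_density(x, locations, weights, width):
    """Evaluate f(x) = sum_j weights[j] * K(x, locations[j]) on an array x."""
    x = np.asarray(x, dtype=float)
    # Broadcast to shape (n_points, n_atoms), then sum over atoms.
    k = gaussian_kernel(x[:, None], locations[None, :], width)
    return k @ weights

# Hypothetical point masses standing in for one sample of phi.
locations = np.array([0.2, 0.5, 0.9])
weights = np.array([0.5, 0.3, 0.2])   # unit normalization

grid = np.linspace(-2.0, 3.0, 5001)
f = mixture_density(grid, locations, weights, width=0.1)

# Riemann-sum check that f integrates to approximately one.
area = np.sum(f) * (grid[1] - grid[0])
print(area)
```

Because the kernel is normalized and the weights sum to one, the resulting **f** is itself a normalized, smooth density even though **\phi** is not smooth.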

We present numerical investigations of textbook data in one and two
dimensions.

MaxEnt 94 Abstracts / mas@mrao.cam.ac.uk