# Function estimation: MaxEnt and beyond

## John Skilling

University of Cambridge, Cavendish Laboratory

Madingley Road, Cambridge, England CB3 0HE

### Abstract

A quantity of interest is distributed as **\phi(x)** and is
measured by data **D** that constrain its integral
properties: what is **\phi**? Although the context
shows **\phi** to be a non-negative density, it is in
general nonparametric (or freeform), in the sense that it cannot be
assumed to admit a reasonable parametrization by a limited
number of parameters. One can seek either to assign a single optimal
**\widehat \phi** or to construct the posterior
probability **\Pr(\phi|D)** which defines the range of
plausible results. The prior and posterior probabilities
**\Pr(\phi)** and **\Pr(\phi|D)** have to be
defined over the very large space of all non-negative functions
**\phi** rather than some parameter space of limited
dimension.
Maximum entropy is the proper way of assigning a single
density consistently with integral constraints.
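
As a minimal illustration of assigning a single density from an integral constraint, the sketch below discretizes **x** into cells and maximizes the normalized entropy -Σ p log(p/m) subject to one constraint on the mean. The function name `maxent_density`, the uniform default model `m`, the normalized form of the entropy, and the target mean of 0.3 are all assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def maxent_density(x, m, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Maximise S = -sum p*log(p/m) subject to sum p = 1 and sum p*x = target_mean.

    Hypothetical helper for illustration. The stationary point has the
    exponential form p_i proportional to m_i * exp(-lam * x_i); the Lagrange
    multiplier lam is found by bisection on the constraint.
    """
    x = np.asarray(x, dtype=float)
    m = np.asarray(m, dtype=float)

    def density_and_mean(lam):
        logw = np.log(m) - lam * x
        w = np.exp(logw - logw.max())   # subtract the max for numerical stability
        p = w / w.sum()
        return p, float(p @ x)

    # The constrained mean decreases monotonically as lam increases,
    # so a simple bisection brackets the solution.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        _, mu = density_and_mean(mid)
        if mu > target_mean:
            lo = mid    # mean too large: a larger multiplier is needed
        else:
            hi = mid
    p, _ = density_and_mean(0.5 * (lo + hi))
    return p

# Assumed toy problem: 11 cells on [0, 1], uniform default model, mean fixed at 0.3.
x = np.linspace(0.0, 1.0, 11)
m = np.full(x.size, 1.0 / x.size)
p = maxent_density(x, m, target_mean=0.3)
```

With a single linear constraint the result is an exponential tilt of the default model; further integral constraints would each add one multiplier.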

Bayesian analysis, though, repeatedly requires integration over
**\phi**-space, thus demanding that the space be endowed
with an integration measure. For analytic and computational
tractability we use a finite discretization, partitioning the relevant
domain of **x** into **M** cells. Since the
underlying problem is continuous, the measure on **M**
cells must be set up in a manner consistent with passage to the
continuum limit (**M \rightarrow \infty**) of continuous
**x**. We derive the form of this condition, which leads
naturally to an explicit power-law form for the integration measure
per unit volume. The specification of this measure logically precedes
the assignment of any probability functions, which appear as pointwise
weighting factors in integrals over **\phi**-measure.

Quantified maximum entropy used the entropy **S(\phi)**
to assign the prior probability required by Bayesian analysis as
**\exp(\alpha S)**, but this leads to an
incompatibility with the continuum limit.

Instead, the prior **\exp(-\beta \phi)** relative to
the power-law measure is the natural choice, having several desirable
properties. The product of this prior and measure is commonly known
as the gamma process. The further constraint of unit normalization on
**\phi** immediately leads to the Dirichlet process. The
use of such processes in Bayesian function estimation is effectively
demanded by a consistency argument, regardless of the form of data to
be analyzed. Specific applications are discussed by Sibusiso Sibisi
in these proceedings.
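
The consistency property can be checked numerically in a small sketch. It assumes the standard discretized gamma process, in which a cell of volume fraction v carries an independent Gamma(αv, rate β) mass; the shape parameter `alpha`, the helper name `sample_gamma_cells`, and the numerical values are illustrative assumptions rather than quantities fixed by the text. Under that assumption the total mass has the same Gamma(α, β) distribution however finely the domain is partitioned, and dividing by the total gives Dirichlet-distributed cell proportions.

```python
import numpy as np

def sample_gamma_cells(volumes, alpha, beta, n, rng):
    """Draw n realisations of independent cell masses phi_i ~ Gamma(alpha*v_i, rate=beta).

    Hypothetical helper: `volumes` holds the cell volume fractions v_i.
    """
    volumes = np.asarray(volumes, dtype=float)
    return rng.gamma(shape=alpha * volumes, scale=1.0 / beta,
                     size=(n, volumes.size))

rng = np.random.default_rng(0)
alpha, beta, n = 5.0, 2.0, 20000   # assumed illustrative values

# Total mass for a coarse and a fine partition of the same domain.
# In both cases it should follow Gamma(alpha, rate=beta):
# mean alpha/beta and variance alpha/beta**2.
stats = {}
for M in (4, 64):
    phi = sample_gamma_cells(np.full(M, 1.0 / M), alpha, beta, n, rng)
    totals = phi.sum(axis=1)
    stats[M] = (totals.mean(), totals.var())

# Imposing unit normalization on each realisation gives Dirichlet-distributed
# proportions, whose mean in each cell equals the cell's volume fraction.
phi4 = sample_gamma_cells(np.full(4, 0.25), alpha, beta, n, rng)
p4 = phi4 / phi4.sum(axis=1, keepdims=True)
```

Comparing `stats[4]` with `stats[64]` shows the total mass statistics agreeing to within sampling error, which is the finite-M counterpart of consistency with the continuum limit.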

MaxEnt 94 Abstracts / mas@mrao.cam.ac.uk