Enhancing Noisy Speech by Recursive Identification of Multiple
Vocal Tract Models
M.Niranjan and W.J.Fitzgerald
Cambridge University Engineering Department
Cambridge CB2 1PZ
In our contribution to MaxEnt92 [1], we discussed
Bayesian parameter estimation and model identification for a speech
production model known as the Formant-Bandwidth model. This model is
widely used in speech processing, in particular speech synthesis.
Parameter estimation for such a Formant model is done recursively
(i.e. sample by sample) by means of an extended Kalman filter
algorithm [2]. It is possible to deal with the nonstationarity of
speech signals, as the vocal tract moves, by running multiple Formant
models in parallel [3,4]. The innovation probability in the Kalman
filter may be interpreted as the Bayesian evidence for each model and
used to calculate a model likelihood recursively.
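
The recursion described above can be written out as follows. This is a sketch
of the standard static multiple-model update treated in [4], not necessarily
the exact form used by the authors. It assumes a scalar observation and a
Gaussian innovation, which is only approximate for an extended Kalman filter.
The symbols e, S and y_{1:t} are notation introduced here for illustration.

```latex
% Assumed notation (not in the original): e_t^{(i)} is the innovation of
% model M_i at sample t, S_t^{(i)} its variance, y_{1:t} the samples so far.
\begin{align}
p\bigl(y_t \mid y_{1:t-1}, M_i\bigr)
  &= \frac{1}{\sqrt{2\pi S_t^{(i)}}}
     \exp\!\left(-\frac{\bigl(e_t^{(i)}\bigr)^2}{2 S_t^{(i)}}\right), \\
P\bigl(M_i \mid y_{1:t}\bigr)
  &= \frac{p\bigl(y_t \mid y_{1:t-1}, M_i\bigr)\, P\bigl(M_i \mid y_{1:t-1}\bigr)}
          {\sum_j p\bigl(y_t \mid y_{1:t-1}, M_j\bigr)\, P\bigl(M_j \mid y_{1:t-1}\bigr)} .
\end{align}
```

The first line is the evidence each model assigns to the new sample. The
second renormalises the running model probabilities so that they sum to one.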
In this paper, we extend the above work to an application of enhancing
speech corrupted by additive noise. In this approach, multiple models
of speech are recursively estimated in parallel. The models differ in
the number of Formants in the speech and the noise covariances of the
dynamical system. An estimate of the speech sample value and the model
likelihood are evaluated for each model. The likelihoods are used to
output a weighted sum of the predicted speech samples as the clean
speech estimate.
At the meeting, we will play a tape demonstrating the enhancement that
can be achieved in this manner.
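
The enhancement scheme outlined above (parallel models, a likelihood for each,
and a weighted sum as output) might be sketched as below. This is a toy
illustration under strong assumptions, not the authors' implementation: each
model is a scalar random-walk Kalman filter rather than an extended Kalman
filter on a Formant model, the models differ only in process noise variance,
and the sketch combines the post-update estimates. All function and parameter
names are hypothetical.

```python
import math


def gaussian_pdf(x, var):
    """Zero-mean Gaussian density with variance `var`, evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def update_model_weights(weights, innovations, innovation_vars):
    """Scale each model weight by its innovation density, then renormalise."""
    scaled = [w * gaussian_pdf(e, s)
              for w, e, s in zip(weights, innovations, innovation_vars)]
    total = sum(scaled)
    if total == 0.0:
        # Every density underflowed; keep the previous weights.
        return list(weights)
    return [v / total for v in scaled]


def combine_estimates(weights, estimates):
    """Likelihood-weighted sum of the per-model sample estimates."""
    return sum(w * x for w, x in zip(weights, estimates))


def run_filter_bank(samples, process_vars, obs_var):
    """Run one scalar random-walk Kalman filter per entry of `process_vars`.

    Returns the combined estimate for each sample and the final weights.
    """
    n = len(process_vars)
    means = [0.0] * n           # per-model state estimate
    variances = [1.0] * n       # per-model state variance (assumed prior)
    weights = [1.0 / n] * n     # equal prior weight for every model
    output = []
    for y in samples:
        innovations, innovation_vars = [], []
        for i, q in enumerate(process_vars):
            prior_var = variances[i] + q          # predict step
            e = y - means[i]                      # innovation
            s = prior_var + obs_var               # innovation variance
            gain = prior_var / s                  # Kalman gain
            means[i] += gain * e                  # measurement update
            variances[i] = (1.0 - gain) * prior_var
            innovations.append(e)
            innovation_vars.append(s)
        weights = update_model_weights(weights, innovations, innovation_vars)
        output.append(combine_estimates(weights, means))
    return output, weights


if __name__ == "__main__":
    noisy = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8]
    enhanced, final_weights = run_filter_bank(noisy, [0.001, 0.5], obs_var=0.1)
    print(enhanced)
    print(final_weights)
```

In a fuller version, each scalar filter would be replaced by an extended
Kalman filter over Formant parameters, with the weight update and the
weighted sum left unchanged.
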
[1] Fitzgerald and Niranjan, ``Speech Processing using Bayesian Inference'',
Proc. 12th International MaxEnt Workshop, Paris, July 1992.
[2] G.Rigoll, ``A new algorithm for estimation of formant trajectories
directly from the speech signal based on an extended Kalman filter'',
Proc. ICASSP 1986, pp. 1229-1232.
[3] M.Niranjan, I.Cox and S.Hingorani, ``Recursive tracking of formants in
speech signals'', Proc. ICASSP 1994, Adelaide.
[4] Y.Bar-Shalom and T.E.Fortmann, Tracking and Data Association,
Prentice Hall, 1988.
MaxEnt 94 Abstracts / email@example.com