Speech Recognition Experiments with a Cochlear Model

Authors

  • Richard F. Lyon

Abstract

There are several ways that a computational model of auditory processing in the cochlea can be applied as the front end of a speech recognition system. For an initial round of experimentation, the fine time structure in the model's output has been used to do spectral sharpening, yielding a "cochleagram" representation analogous to a short-time spectral representation. In later experiments, fine time structure will be exploited for a more detailed characterization of sounds, and for sound separation. So far, experiments have been done with only two words ("one" and "nine") spoken by 112 talkers, to limit the range of phonetic variation to simple voiced sounds, while providing a good sample of inter-speaker variation. The structure of the vector space of "auditory spectra" has been examined through vector quantization experiments, which yield a measure of information content and local dimensionality.
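
As a rough illustration of the vector quantization step, the sketch below trains a codebook with the generalized Lloyd (k-means) iteration and reports the mean squared-error distortion. The abstract does not say which training algorithm or distortion measure was used, so the algorithm, the squared-error measure, and the names `train_codebook` and `mean_distortion` are assumptions made for illustration only. A codebook of k entries costs log2(k) bits per frame, and one common way to probe information content and dimensionality is to watch how distortion falls as k grows.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook: Sequence[Vector], v: Vector) -> int:
    """Index of the codeword closest to v (exhaustive search)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))


def train_codebook(data: Sequence[Vector], k: int, iterations: int = 20) -> List[List[float]]:
    """Generalized Lloyd iteration, seeded with the first k training vectors."""
    codebook = [list(v) for v in data[:k]]
    for _ in range(iterations):
        # Partition the training vectors by nearest codeword.
        cells: List[List[Vector]] = [[] for _ in range(k)]
        for v in data:
            cells[nearest(codebook, v)].append(v)
        # Move each codeword to the centroid of its cell (empty cells stay put).
        for i, cell in enumerate(cells):
            if cell:
                dim = len(cell[0])
                codebook[i] = [sum(v[d] for v in cell) / len(cell) for d in range(dim)]
    return codebook


def mean_distortion(codebook: Sequence[Vector], data: Sequence[Vector]) -> float:
    """Average squared error when each vector is replaced by its nearest codeword."""
    return sum(sq_dist(codebook[nearest(codebook, v)], v) for v in data) / len(data)


# Toy "spectra": two well-separated clusters in two dimensions.
toy = [[0, 0], [0, 1], [10, 10], [10, 11], [1, 0], [11, 10]]
for k in (1, 2):
    cb = train_codebook(toy, k)
    print(k, math.log2(k), mean_distortion(cb, toy))
```

In this toy case the distortion drops sharply between k = 1 and k = 2 because the data really do fall into two clusters; with real auditory spectra the rate of decrease would be the quantity of interest.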

The inclusion of more dimensions of perceptual variation, such as pitch and loudness, in a speech front end representation is both an opportunity and a problem. Much larger vector quantization codebooks and more training data may be needed to take advantage of the extra information dimensions. A product-code approach and an improved algorithm for finding the nearest neighbor codeword are suggested to help cope with the problem and take advantage of the opportunity.
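
The product-code idea can be sketched as follows: the feature vector is split into sub-vectors, each quantized against its own small codebook, so the effective codebook size is the product of the individual sizes while storage and search grow only with their sum. The split into a "spectral shape" part and a "loudness" part, the toy codebooks, and the function names here are hypothetical; the abstract does not describe how the author's product code is partitioned, and the exhaustive search below is not the improved nearest-neighbor algorithm it mentions.

```python
from typing import Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook: Sequence[Vector], v: Vector) -> int:
    """Index of the codeword closest to v (plain exhaustive search)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))


def product_encode(codebooks: Sequence[Sequence[Vector]],
                   widths: Sequence[int],
                   v: Vector) -> Tuple[int, ...]:
    """Quantize each sub-vector of v independently; return one index per part."""
    indices = []
    start = 0
    for codebook, width in zip(codebooks, widths):
        indices.append(nearest(codebook, v[start:start + width]))
        start += width
    return tuple(indices)


def effective_size(codebooks: Sequence[Sequence[Vector]]) -> int:
    """Number of distinct composite codewords the product code can represent."""
    size = 1
    for codebook in codebooks:
        size *= len(codebook)
    return size


# Hypothetical split: two "shape" dimensions plus one "loudness" dimension.
shape_cb = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
loudness_cb = [[1.0], [5.0]]
frame = [0.9, 0.1, 5.2]

code = product_encode([shape_cb, loudness_cb], [2, 1], frame)
print(code, effective_size([shape_cb, loudness_cb]))
```

Here five stored codewords stand in for six composite ones; the saving grows quickly as the individual codebooks get larger.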

Preliminary recognition experiments using a single codebook per word and no time sequence information have shown a performance of about 97% correct one/nine discrimination for talkers outside the training set, and 100% correct for second repetitions from talkers in the training set. Further experiments are currently underway.
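
One plausible reading of "a single codebook per word and no time sequence information" is a minimum-distortion classifier: every frame of the unknown utterance is quantized against each word's codebook, the distortions are averaged over frames regardless of their order, and the word whose codebook fits best wins. The sketch below follows that reading; the decision rule, the squared-error measure, and the toy codebooks are assumptions, not details taken from the abstract.

```python
from typing import Dict, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def avg_distortion(codebook: Sequence[Vector], frames: Sequence[Vector]) -> float:
    """Mean nearest-codeword distortion over all frames, ignoring frame order."""
    return sum(min(sq_dist(c, f) for c in codebook) for f in frames) / len(frames)


def classify(codebooks: Dict[str, Sequence[Vector]], frames: Sequence[Vector]) -> str:
    """Pick the word whose codebook represents the utterance with least distortion."""
    return min(codebooks, key=lambda word: avg_distortion(codebooks[word], frames))


# Toy two-dimensional codebooks standing in for trained "one" and "nine" codebooks.
word_codebooks = {
    "one": [[0.0, 0.0], [1.0, 0.0]],
    "nine": [[0.0, 1.0], [1.0, 1.0]],
}
utterance = [[0.1, 0.9], [0.9, 1.1], [0.2, 0.8]]

print(classify(word_codebooks, utterance))
```

Because the frames are treated as an unordered set, shuffling the utterance leaves the decision unchanged, which is what discarding time sequence information amounts to.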

Published

2022-12-03

How to Cite

1. Lyon RF. Speech Recognition Experiments with a Cochlear Model. Canadian Acoustics [Internet]. 2022 Dec. 3 [cited 2023 Feb. 8];14(3 bis):10-1. Available from: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/3498

Section

Proceedings of the Acoustics Week in Canada