The Auditory Processing of Speech
Abstract
The processing of speech in the mammalian auditory periphery is discussed in terms of the spatio-temporal nature of the distribution of the cochlear response and the novel encoding schemes this permits. Algorithms to detect specific morphological features of the response patterns are also considered for the extraction of stimulus spectral parameters.
The remarkable abilities of the human auditory system to detect, separate, and recognize speech and environmental sounds have been the subject of extensive physiological and psychological research for several decades. The results of this research have strongly influenced developments in fields ranging from auditory prostheses to the encoding, analysis, and automatic recognition of speech. In recent years, improved experimental techniques have precipitated major advances in our understanding of sound processing in the auditory periphery. Most important among these is the introduction of nerve-fiber population recordings, which made possible the reconstruction of both the temporal and spatial distribution of activity in the auditory nerve in response to acoustic stimuli [1, 2]. Sachs et al. utilized such data to demonstrate the existence of a highly accurate temporal structure that is capable of providing a faithful and robust representation of speech spectra over a wide dynamic range and under relatively low signal-to-noise conditions [3, 4]. Their work has since motivated further research into the various algorithms that the central nervous system (CNS) might employ to detect and extract these and other response features, and the possible neural structures that underlie them [5, 6].
In pursuit of these goals, we have constructed and analyzed the spatio-temporal response patterns of the cat's auditory nerve to synthesized speech sounds [14, 5]. These patterns are formed by spatially organizing the temporal response waveforms (or PST histograms) of the auditory-nerve fibers according to their characteristic frequency (CF) [4]. The resulting display highlights the interplay of temporal and spatial cues across the fiber array and suggests novel ways of viewing cochlear processing and encoding of complex sounds [7, 5]. The availability of such experimental data, however, is at present limited by technical constraints and the massive amount of processing required to handle them. Thus, in order to analyze new speech tokens, and to facilitate the necessary manipulation of stimulus and/or processing conditions and parameters, we have developed detailed biophysical and computational models of the auditory periphery and used them to generate spatio-temporal response patterns to natural and synthesized speech stimuli. Various CNS schemes for the estimation of stimulus spectral parameters are then investigated based on these patterns.
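Neither the experimental PST data nor the authors' biophysical model are reproduced here; the following is only a minimal, illustrative sketch of how a CF-ordered spatio-temporal response pattern of this kind might be assembled computationally. A gammatone filterbank with ERB-spaced characteristic frequencies stands in for cochlear filtering, and half-wave rectification followed by low-pass filtering crudely approximates hair-cell transduction. All function names, parameter values, and the synthetic vowel-like stimulus are assumptions made for illustration, not elements of the original model.

# Illustrative sketch only: NOT the authors' biophysical model of the auditory periphery.
import numpy as np
from scipy.signal import butter, lfilter

def erb_spaced_cfs(f_lo=100.0, f_hi=8000.0, n=64):
    """Characteristic frequencies spaced on an ERB-rate scale (Glasberg & Moore)."""
    erb = lambda f: 21.4 * np.log10(4.37e-3 * f + 1.0)
    inv = lambda e: (10.0 ** (e / 21.4) - 1.0) / 4.37e-3
    return inv(np.linspace(erb(f_lo), erb(f_hi), n))

def gammatone_ir(cf, fs, dur=0.025, order=4):
    """Impulse response of a gammatone filter centred at cf (Hz)."""
    t = np.arange(0.0, dur, 1.0 / fs)
    b = 1.019 * 24.7 * (4.37e-3 * cf + 1.0)          # ERB bandwidth at this CF
    return t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * cf * t)

def spatiotemporal_pattern(x, fs, cfs):
    """Return an (n_channels x n_samples) array of channel responses ordered by CF."""
    b_lp, a_lp = butter(2, 1000.0 / (fs / 2))        # ~1 kHz low-pass (crude hair-cell membrane)
    rows = []
    for cf in cfs:
        y = np.convolve(x, gammatone_ir(cf, fs), mode="same")   # "basilar-membrane" filtering
        y = np.maximum(y, 0.0)                                   # half-wave rectification
        rows.append(lfilter(b_lp, a_lp, y))                      # smoothed channel response
    return np.vstack(rows)

if __name__ == "__main__":
    fs = 16000
    t = np.arange(0.0, 0.1, 1.0 / fs)
    # Synthetic vowel-like stimulus: harmonics of 125 Hz shaped by two "formant" peaks.
    f0 = 125.0
    env = lambda f: np.exp(-((f - 700.0) / 200.0) ** 2) + np.exp(-((f - 1200.0) / 200.0) ** 2)
    x = sum(env(k * f0) * np.sin(2 * np.pi * k * f0 * t) for k in range(1, 60))
    pattern = spatiotemporal_pattern(x, fs, erb_spaced_cfs())
    print(pattern.shape)   # (64, 1600): CF-ordered temporal responses

Ordering the channel outputs by CF yields a two-dimensional (place by time) display analogous to the spatio-temporal patterns described above, from which temporal and spatial cues can be examined jointly.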