A comparison of pitch extraction methodologies for dolphin vocalization

Xanadu C. Halkias; Daniel P. W. Ellis

Authors

Xanadu C. Halkias LabRosa, Columbia University, Department of Electrical Engineering, 1300 S.W. Mudd, 500 West 120th Street, New York, NY 10027
Daniel P. W. Ellis LabRosa, Columbia University, Department of Electrical Engineering, 1300 S.W. Mudd, 500 West 120th Street, New York, NY 10027

Keywords:

Amplitude modulation, Hidden Markov models, Mammals, Markov processes, Mathematical programming, Signal to noise ratio, Signaling, Systems engineering, Wavelet transforms, Frequency ranging, Pitch extraction

Abstract

When collecting and analyzing marine mammal vocalizations, one of the most important goals is to automatically extract the pitch (fundamental frequency) of the collected calls. In dolphins, we can assume that there are two main pitched sounds: whistles, which can be described as tonal AM-FM signals, and bursts, which can be described as highly harmonic signals. Three main difficulties with pitch extraction on dolphin vocalizations arise from the nature of the data. First, most underwater recordings have a low signal-to-noise ratio due to reflections, hardware noise, and other interference, which poses a major challenge for most existing pitch trackers. Second, one has to take into account the significant differences between the frequency range of bottlenose dolphin vocalizations and that of human vocalizations. Finally, dolphin whistles and bursts are generally emitted in two distinct frequency ranges, which results in different modes in the analysis data. In this work, we compare our novel pitch extraction approach with two widely used algorithms. Our approach uses hierarchy-based hidden Markov models (HMMs) with cepstral coefficients as features. We quantitatively compare the performance of our algorithm with Yin, which is based on a modified autocorrelation method, and get_f0, a popular off-the-shelf pitch tracker that uses linear predictive coefficients (LPC) and dynamic programming. Our approach outperforms both comparison methods by at least 10%.
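
To make the idea of autocorrelation-based pitch extraction concrete, the following is a minimal sketch of a plain normalized-autocorrelation pitch estimator applied to a synthetic tone. It is not the authors' HMM approach, nor the Yin or get_f0 algorithms themselves; it only illustrates the basic principle that Yin modifies. The function name `autocorr_pitch`, the 96 kHz sample rate, the 8 kHz test tone, and the 5-20 kHz search band are all assumptions chosen for illustration.

```python
import math


def autocorr_pitch(frame, sample_rate, fmin, fmax):
    """Estimate the pitch (Hz) of one frame from its normalized autocorrelation.

    Hypothetical helper for illustration only: it searches integer lags
    between sample_rate/fmax and sample_rate/fmin and returns the frequency
    corresponding to the lag with the highest normalized correlation.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]  # remove the DC offset

    lag_min = max(1, int(sample_rate / fmax))
    lag_max = min(n - 1, int(sample_rate / fmin))

    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        m = n - lag
        num = sum(x[i] * x[i + lag] for i in range(m))
        e1 = sum(x[i] ** 2 for i in range(m))
        e2 = sum(x[i + lag] ** 2 for i in range(m))
        score = num / math.sqrt(e1 * e2) if e1 > 0 and e2 > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic whistle-like tone (assumed values, not taken from the paper).
sample_rate = 96000   # Hz
f0 = 8000.0           # Hz, exactly 12 samples per period at 96 kHz
frame = [math.sin(2 * math.pi * f0 * t / sample_rate) for t in range(960)]

estimate = autocorr_pitch(frame, sample_rate, fmin=5000.0, fmax=20000.0)
print(f"estimated pitch: {estimate:.1f} Hz")
```

Because only integer lags are searched, the frequency resolution of such a sketch is coarse at high pitches: at 96 kHz, neighbouring lags of 12 and 13 samples correspond to 8000 Hz and roughly 7385 Hz. The search band also has to be set for the expected call type, which reflects the frequency-range issues the abstract describes.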

Published

2008-03-01

How to Cite

Halkias XC, Ellis DPW. A comparison of pitch extraction methodologies for dolphin vocalization. Canadian Acoustics [Internet]. 2008 Mar. 1 [cited 2025 Feb. 19];36(1):74-80. Available from: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/1994

Issue

Vol. 36 No. 1 (2008)

Section

Proceedings of the Acoustics Week in Canada

License

Author Licensing Addendum

This Licensing Addendum ("Addendum") is entered into between the undersigned Author(s) and Canadian Acoustics journal published by the Canadian Acoustical Association (hereinafter referred to as the "Publisher"). The Author(s) and the Publisher agree as follows:

Retained Rights: The Author(s) retain(s) the following rights:
- The right to reproduce, distribute, and publicly display the Work on the Author's personal website or the website of the Author's institution.
- The right to use the Work in the Author's teaching activities and presentations.
- The right to include the Work in a compilation for the Author's personal use, not for sale.
Grant of License: The Author(s) grant(s) to the Publisher a worldwide exclusive license to publish, reproduce, distribute, and display the Work in Canadian Acoustics and any other formats and media deemed appropriate by the Publisher.
Attribution: The Publisher agrees to include proper attribution to the Author(s) in all publications and reproductions of the Work.
No Conflict: This Addendum is intended to be in harmony with, and not in conflict with, the terms and conditions of the original agreement entered into between the Author(s) and the Publisher.
Copyright Clause: Copyright on articles is held by the Author(s). The corresponding Author has the right to grant on behalf of all Authors and does grant on behalf of all Authors, a worldwide exclusive license to the Publisher and its licensees in perpetuity, in all forms, formats, and media (whether known now or created in the future), including but not limited to the rights to publish, reproduce, distribute, display, store, translate, create adaptations, reprints, include within collections, and create summaries, extracts, and/or abstracts of the Contribution.
