An empirical comparison of three audio fingerprinting methods in music and feature-length film

Authors

  • Thanh Pham Dept. of Computer Science, University of Toronto, Toronto, ON L4J 7P7, Canada
  • Matthew Giamou Dept. of Computer Science, University of Toronto, Toronto, ON L4J 7P7, Canada
  • Gerald Penn Dept. of Computer Science, University of Toronto, Toronto, ON L4J 7P7, Canada

Keywords:

Algorithms, Audio fingerprinting, Bitrates, Data sets, Empirical comparison, Music data, Vision-based algorithms

Abstract

An empirical comparison of three audio fingerprinting methods in music and feature-length film is presented. Shazam, a commercially successful algorithm, was chosen and compared against two vision-based algorithms: the original CMU algorithm and Google's Waveprint algorithm. Each query is matched to its corresponding song or film in the respective dataset. The feature-length film dataset was transcoded and downsampled from 48 kHz to 16 kHz mono-channel PCM at a 256 kbps bitrate. The experiments were conducted on a machine with a single 3.0 GHz Intel Xeon CPU with a 4 MB cache and 16 GB of RAM. The F-measures on the two datasets show that optimizations for quality pay very high dividends on film audio, but not on music data. It is also found that Shazam handily outperforms both vision-based algorithms, in both time and quality.
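To make the comparison concrete, the Shazam-style approach the abstract refers to fingerprints audio by hashing pairs of spectrogram peaks and matching queries by vote-counting consistent time offsets. The sketch below is not the authors' implementation; it is a minimal, self-contained illustration of that landmark-hashing idea, using a toy spectrogram (a list of per-frame magnitude lists) in place of a real STFT, and hypothetical helper names (`peaks`, `hashes`, `match`):

```python
from collections import Counter, defaultdict

def peaks(spec, thresh=0.5):
    """Return (time, freq) coordinates of local maxima above thresh
    in a spectrogram given as a list of per-frame magnitude lists."""
    pts = []
    for t, frame in enumerate(spec):
        for f, mag in enumerate(frame):
            if mag < thresh:
                continue
            neighbours = [
                spec[tt][ff]
                for tt in range(max(t - 1, 0), min(t + 2, len(spec)))
                for ff in range(max(f - 1, 0), min(f + 2, len(frame)))
                if (tt, ff) != (t, f)
            ]
            if all(mag > n for n in neighbours):
                pts.append((t, f))
    return pts

def hashes(pts, fan_out=3):
    """Pair each anchor peak with the next few peaks to form
    (f1, f2, dt) landmark hashes, tagged with the anchor's time."""
    out = []
    for i, (t1, f1) in enumerate(pts):
        for t2, f2 in pts[i + 1 : i + 1 + fan_out]:
            out.append(((f1, f2, t2 - t1), t1))
    return out

def match(db_hashes, query_hashes):
    """Vote for the time offset aligning query to database; a sharp
    winning offset indicates a true match. Returns (offset, votes)."""
    index = defaultdict(list)
    for h, t in db_hashes:
        index[h].append(t)
    votes = Counter()
    for h, tq in query_hashes:
        for tdb in index.get(h, []):
            votes[tdb - tq] += 1
    return votes.most_common(1)[0] if votes else None

# Toy database spectrogram: 12 frames x 8 bins with five isolated peaks.
db_spec = [[0.0] * 8 for _ in range(12)]
for t, f in [(1, 2), (3, 5), (5, 1), (7, 6), (9, 3)]:
    db_spec[t][f] = 1.0

# A query clipped from frame 5 onward should align at offset 5.
result = match(hashes(peaks(db_spec)), hashes(peaks(db_spec[5:])))
```

The design choice that distinguishes this family of methods is that matching reduces to exact hash lookups plus a histogram over offsets, which is what makes Shazam fast relative to the vision-based fingerprints compared in the paper.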

Supplementary files

Published

2012-09-01

How to cite

1.
Pham T, Giamou M, Penn G. An empirical comparison of three audio fingerprinting methods in music and feature-length film. Canadian Acoustics [Internet]. Sept. 1, 2012 [cited Feb. 15, 2025];40(3):92-3. Available at: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/2555

Issue

Section

Proceedings of the Acoustics Week in Canada conference