An empirical comparison of three audio fingerprinting methods in music and feature-length film

Authors

  • Thanh Pham Dept. of Computer Science, University of Toronto, Toronto, ON L4J 7P7, Canada
  • Matthew Giamou Dept. of Computer Science, University of Toronto, Toronto, ON L4J 7P7, Canada
  • Gerald Penn Dept. of Computer Science, University of Toronto, Toronto, ON L4J 7P7, Canada

Keywords:

Algorithms, Audio fingerprinting, Bitrates, Data sets, Empirical comparison, Music data, Vision based algorithms

Abstract

An empirical comparison of three audio fingerprinting methods in music and feature-length film is presented. Shazam, a commercially successful algorithm, was compared against two vision-based algorithms: the original CMU algorithm and Google's Waveprint algorithm. For each dataset in turn, the task is to identify the song or film corresponding to each query. The feature-length film dataset was transcoded and downsampled from 48 kHz to 16 kHz mono-channel PCM at a bitrate of 256 kbps. The experiments were conducted on a machine with a single 3.0 GHz Intel Xeon CPU with a 4 MB cache and 16 GB of RAM. The F-measures on the two datasets show that optimizations for quality pay very high dividends on film audio, but not on music data. Shazam is also found to handily outperform both vision-based algorithms in both time and quality.
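
The abstract reports F-measures without stating how they were computed. As a minimal sketch, assuming the standard definition based on precision and recall (with beta = 1 giving the balanced F1 score), the metric could be calculated as follows. The function name, its parameters, and the counts in the usage note are hypothetical and are not taken from the paper.

```python
def f_measure(true_positives, false_positives, false_negatives, beta=1.0):
    """Return the F-measure for a set of identification results.

    Assumption: the standard precision/recall definition is used; the
    abstract does not say which variant the authors computed.
    """
    if true_positives == 0:
        # No correct identifications: precision and recall are both zero.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
```

For example, with purely illustrative counts, `f_measure(8, 2, 2)` has precision and recall both equal to 0.8, so the balanced F-measure is also 0.8.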

Published

2012-09-01

How to Cite

Pham T, Giamou M, Penn G. An empirical comparison of three audio fingerprinting methods in music and feature-length film. Canadian Acoustics [Internet]. 2012 Sep. 1 [cited 2025 Feb. 13];40(3):92-3. Available from: https://jcaa.caa-aca.ca/index.php/jcaa/article/view/2555

Section

Proceedings of the Acoustics Week in Canada