An empirical comparison of three audio fingerprinting methods in music and feature-length film
Keywords: Algorithms, Audio fingerprinting, Bitrates, Data sets, Empirical comparison, Music data, Vision-based algorithms
Abstract: An empirical comparison of three audio fingerprinting methods on music and feature-length film audio is presented. Shazam, a commercially successful algorithm, is evaluated against two vision-based algorithms: the original CMU algorithm and Google's Waveprint algorithm. Each query is identified against the song or film in the corresponding dataset. The feature-length film dataset was transcoded and down-sampled from 48 kHz to 16 kHz mono-channel PCM at a 256 kbps bitrate. The experiments were conducted on a machine with a single 3.0 GHz Intel Xeon CPU with a 4 MB cache and 16 GB of RAM. The F-measures on the two datasets show that optimizations for quality pay high dividends on film audio, but not on music data. It is also found that Shazam handily outperforms both vision-based algorithms, in both time and quality.
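As a concrete illustration of the preprocessing step mentioned in the abstract (not the authors' code), down-sampling 48 kHz mono PCM to 16 kHz can be sketched in plain Python. Since 48/16 = 3, each non-overlapping group of three samples is averaged; the boxcar average acts as a rough anti-aliasing low-pass before decimation, whereas a production pipeline would use a proper polyphase resampling filter.

```python
import math

def downsample_48k_to_16k(samples):
    """Crudely down-sample 48 kHz mono PCM to 16 kHz.

    48 kHz / 16 kHz = 3, so average each non-overlapping group of
    three samples. The averaging is a rough anti-aliasing filter;
    a real system would apply a designed low-pass filter instead.
    """
    n = len(samples) - len(samples) % 3   # trim to a multiple of 3
    return [sum(samples[i:i + 3]) / 3 for i in range(0, n, 3)]

# Example: one second of a 440 Hz tone at 48 kHz yields 16,000 samples.
tone = [math.sin(2 * math.pi * 440 * t / 48_000) for t in range(48_000)]
out = downsample_48k_to_16k(tone)
print(len(out))  # 16000
```

This sketch only fixes the 3:1 ratio used in the paper's film dataset; arbitrary-rate conversion would require interpolation as well as decimation.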