A new relativistic vision in speaker discrimination
Keywords: Classifiers, Learning systems, Neural networks, Speech recognition, Discrimination accuracies, Document indexing, Learning time, Multi-Layer Perceptron, Neural network classifiers, New models, Speaker models, Speaker verifications, Speech database, Speech signals
Abstract: The present paper deals with the task of speaker discrimination using a new relativistic approach. Speaker discrimination has two practical applications: speaker verification and audio document indexing. In such applications, the speaker model is usually extracted directly from the speaker's own speech signal, using only that speaker's own features. However, such a model can be rigid, inaccurate and ill-suited to fluctuating environments where the recording conditions may change. For instance, during telephone conversations, the vocal features of the same speaker may change considerably. Hence, a new relative speaker model is introduced. The new model is based on a relative characterization of the speaker, called the Relative Speaker Characteristic (RSC). The RSC models one speaker relative to another, meaning that each speaker model needs both the speaker's own speech signal and the competing speech (the speech of the speaker to be compared with). This investigation shows that the relative model, used as input to a neural network classifier, optimizes the training of the classifier, speeds up its learning time and enhances the discrimination accuracy. The speaker discrimination experiments are carried out on two different databases, the Hub4 Broadcast-News database and a telephone speech database, using a Multi-Layer Perceptron (MLP) with several input characteristics. Results indicate that the RSC is the best characteristic when compared with other reduced features evaluated in the same manner.
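The idea of characterizing one speaker relative to another can be illustrated with a small sketch. The abstract does not give the RSC formula, so the descriptor below is only an assumed stand-in, not the authors' definition: it takes the per-dimension difference of the mean feature vectors of two speech segments and scales it by their pooled standard deviation. The function names (`relative_feature`, `synthetic_segment`) and the synthetic Gaussian "frames" are hypothetical and exist only for this illustration.

```python
import math
import random
from statistics import fmean, pvariance


def mean_and_var(frames):
    """Per-dimension mean and population variance of a list of feature frames."""
    dims = list(zip(*frames))  # transpose: one tuple of values per dimension
    return [fmean(d) for d in dims], [pvariance(d) for d in dims]


def relative_feature(frames_a, frames_b, eps=1e-9):
    """Hypothetical relative descriptor of segment A with respect to segment B.

    ASSUMPTION: this is NOT the RSC from the paper. It is a simple
    mean difference scaled by the pooled standard deviation, used only
    to show a feature that depends on both segments at once.
    """
    mean_a, var_a = mean_and_var(frames_a)
    mean_b, var_b = mean_and_var(frames_b)
    return [
        (ma - mb) / math.sqrt(0.5 * (va + vb) + eps)
        for ma, va, mb, vb in zip(mean_a, var_a, mean_b, var_b)
    ]


def synthetic_segment(rng, centre, n_frames=200, spread=1.0):
    """Fake 'speech' segment: Gaussian frames around a speaker-specific centre."""
    return [[rng.gauss(c, spread) for c in centre] for _ in range(n_frames)]


def norm(vector):
    """Euclidean norm of a plain Python list."""
    return math.sqrt(sum(x * x for x in vector))


rng = random.Random(0)
speaker_x = [0.0, 1.0, -1.0]   # made-up feature centres for two speakers
speaker_y = [2.0, -1.0, 0.5]

# Two segments from the same speaker versus segments from different speakers.
same = relative_feature(synthetic_segment(rng, speaker_x),
                        synthetic_segment(rng, speaker_x))
diff = relative_feature(synthetic_segment(rng, speaker_x),
                        synthetic_segment(rng, speaker_y))

print("same-speaker relative feature norm:     ", norm(same))
print("different-speaker relative feature norm:", norm(diff))
```

In a setup like the one the abstract describes, a vector of this kind would serve as the classifier input (for example, to an MLP), with the label indicating whether the two segments come from the same speaker. Under this assumed descriptor, swapping the two segments flips the sign of every component.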
Copyright on articles is held by the author(s). The corresponding author has the right to grant on behalf of all authors, and does grant on behalf of all authors, a worldwide exclusive licence (or non-exclusive licence for government employees) to the Publisher and its licensees in perpetuity, in all forms, formats and media (whether known now or created in the future):
i) to publish, reproduce, distribute, display and store the Contribution;
ii) to translate the Contribution into other languages, create adaptations and reprints, include it within collections, and create summaries, extracts and/or abstracts of the Contribution;
iii) to exploit all subsidiary rights in the Contribution;
iv) to include electronic links from the Contribution to third party material wherever it may be located;
v) to license any third party to do any or all of the above.