U.S. Department of Energy
Office of Scientific and Technical Information

Reducing computation in an i-vector speaker recognition system using a tree-structured universal background model

Journal Article · Speech Communication
 [1];  [2]
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
  2. New Mexico State Univ., Las Cruces, NM (United States)

Most state-of-the-art speaker recognition (SR) systems use speaker models derived from an adapted universal background model (UBM) in the form of a Gaussian mixture model (GMM). This is true for GMM supervector systems, joint factor analysis systems, and, most recently, i-vector systems. In all of these systems, the calculation of posterior probabilities and sufficient statistics is a computational bottleneck in both enrollment and testing. We propose a multi-layered hash system that employs a tree-structured GMM–UBM, built using Runnalls’ Gaussian mixture reduction technique, to reduce the number of these calculations. With this tree-structured hash, we can trade off reduction in computation against a corresponding degradation of equal error rate (EER). As an example, we reduce this computation by a factor of 15× while incurring less than 10% relative degradation of EER (or 0.3% absolute EER) when evaluated on NIST 2010 speaker recognition evaluation (SRE) telephone data.
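
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-layer tree, full covariance matrices, and a greedy pairwise merge driven by Runnalls' Kullback-Leibler cost bound, with moment-preserving merges. All function names (`merge_pair`, `runnalls_cost`, `build_parents`, `hashed_posteriors`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def merge_pair(w1, m1, P1, w2, m2, P2):
    """Moment-preserving merge of two weighted Gaussians."""
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    d = (m1 - m2)[:, None]
    P = (w1 * P1 + w2 * P2) / w + (w1 * w2 / w**2) * (d @ d.T)
    return w, m, P

def runnalls_cost(w1, m1, P1, w2, m2, P2):
    """Runnalls' upper bound on the KL cost of merging two components."""
    w, _, P = merge_pair(w1, m1, P1, w2, m2, P2)
    logdet = lambda A: np.linalg.slogdet(A)[1]
    return 0.5 * (w * logdet(P) - w1 * logdet(P1) - w2 * logdet(P2))

def log_gauss(x, m, P):
    """Log density of a multivariate Gaussian at x."""
    d = x - m
    return -0.5 * (len(m) * np.log(2 * np.pi)
                   + np.linalg.slogdet(P)[1]
                   + d @ np.linalg.solve(P, d))

def build_parents(weights, means, covs, n_parents):
    """Greedily merge the cheapest pair until n_parents nodes remain.

    Each node is (weight, mean, cov, member_indices), where the members
    are indices of the original UBM components it summarizes.
    """
    nodes = [(w, m, P, [i])
             for i, (w, m, P) in enumerate(zip(weights, means, covs))]
    while len(nodes) > n_parents:
        pairs = ((a, b) for a in range(len(nodes))
                 for b in range(a + 1, len(nodes)))
        a, b = min(pairs, key=lambda ab: runnalls_cost(*nodes[ab[0]][:3],
                                                       *nodes[ab[1]][:3]))
        w, m, P = merge_pair(*nodes[a][:3], *nodes[b][:3])
        merged = (w, m, P, nodes[a][3] + nodes[b][3])
        nodes = [n for k, n in enumerate(nodes) if k not in (a, b)] + [merged]
    return nodes

def hashed_posteriors(x, weights, means, covs, parents):
    """Approximate component posteriors by scoring only one parent's children."""
    scores = [np.log(w) + log_gauss(x, m, P) for w, m, P, _ in parents]
    members = parents[int(np.argmax(scores))][3]
    logp = np.array([np.log(weights[i]) + log_gauss(x, means[i], covs[i])
                     for i in members])
    post = np.exp(logp - logp.max())
    post /= post.sum()
    out = np.zeros(len(weights))
    out[members] = post  # components outside the chosen branch get zero
    return out
```

In this sketch, each frame costs one Gaussian evaluation per parent plus one per child of the selected parent, instead of one per UBM component. Using fewer parents, or keeping the top few parents rather than only the best one, is one way to move along the computation versus accuracy trade-off the abstract describes.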

Research Organization:
Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
AC04-94AL85000
OSTI ID:
1140965
Report Number(s):
SAND--2014-2055J; PII: S0167639314000582
Journal Information:
Speech Communication, Vol. 66, Issue C; ISSN 0167-6393
Publisher:
Elsevier
Country of Publication:
United States
Language:
English

Similar Records

Efficient Speaker Verification Using Gaussian Mixture Model Component Clustering
Technical Report · Sat Mar 31 20:00:00 EDT 2012 · OSTI ID:1039402

What the speaker means: the recognition of speakers' plans in discourse
Journal Article · Fri Dec 31 23:00:00 EST 1982 · Comput. Math. Appl.; (United States) · OSTI ID:5436353

Speaker recognition through NLP and CWT modeling.
Conference · Wed Jun 23 00:00:00 EDT 1999 · OSTI ID:11824