MALCOM X: Combining maximum likelihood continuity mapping with Gaussian mixture models
Abstract
Gaussian mixture models (GMMs) are among the best speaker recognition algorithms currently available. However, a GMM's estimate of the probability of the speech signal does not change if the temporal order of the feature vectors is randomly shuffled, even though the actual probability of observing the shuffled signal would be dramatically different, probably near zero. A potential way to improve the performance of GMMs is to incorporate temporal information into the estimate of the probability of the data. Doing so could improve speech recognition and speaker recognition, and could potentially aid in detecting lies (abnormalities) in speech data. As described in other documents (Hogden, 1996), MALCOM is an algorithm that can be used to estimate the probability of a sequence of categorical data. MALCOM can also be applied to speech (and other real-valued sequences) if windows of the speech are first categorized using a technique such as vector quantization (Gray, 1984). However, by quantizing the windows of speech, MALCOM ignores information about the within-category differences of the speech windows. Thus, MALCOM and GMMs complement each other: MALCOM is good at using sequence information, whereas GMMs capture within-category differences better than the vector quantization typically used by MALCOM. An extension of MALCOM (MALCOM X) that can be used for estimating the probability of a speech sequence is described here.
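The two limitations named in the abstract can be illustrated with a minimal sketch. It is not code from the report: the function names, the diagonal-covariance parameterization, the toy parameter values, and the two-entry codebook are all assumptions chosen for illustration. The first part shows that a frame-independent GMM log-likelihood is unchanged when the frames are shuffled; the second shows that nearest-codeword vector quantization assigns two different frames to the same category, while the GMM still scores them differently.

```python
import numpy as np


def gmm_frame_log_density(frames, weights, means, variances):
    """Log density of each frame under a diagonal-covariance GMM (illustrative)."""
    # frames: (T, D); weights: (K,); means and variances: (K, D)
    diff = frames[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = log_norm[None, :] - 0.5 * np.sum(
        diff ** 2 / variances[None, :, :], axis=2
    )                                                                  # (T, K)
    weighted = np.log(weights)[None, :] + log_comp
    # Log-sum-exp over mixture components for numerical stability.
    peak = weighted.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(weighted - peak).sum(axis=1, keepdims=True)))[:, 0]


def gmm_sequence_log_likelihood(frames, weights, means, variances):
    """Sum of per-frame log densities: frames are treated as independent."""
    return float(np.sum(gmm_frame_log_density(frames, weights, means, variances)))


def quantize(frames, codebook):
    """Nearest-codeword vector quantization: index of the closest codebook row."""
    dists = np.sum((frames[:, None, :] - codebook[None, :, :]) ** 2, axis=2)
    return np.argmin(dists, axis=1)


# Hypothetical toy model: 2 components, 2-dimensional features.
weights = np.array([0.6, 0.4])
means = np.array([[0.0, 0.0], [3.0, 3.0]])
variances = np.array([[1.0, 1.0], [0.5, 2.0]])

# A smooth, temporally ordered trajectory standing in for speech features.
rng = np.random.default_rng(0)
frames = np.cumsum(rng.normal(scale=0.3, size=(50, 2)), axis=0)
shuffled = frames[rng.permutation(len(frames))]

ll_original = gmm_sequence_log_likelihood(frames, weights, means, variances)
ll_shuffled = gmm_sequence_log_likelihood(shuffled, weights, means, variances)
print("log-likelihood, original order:", ll_original)
print("log-likelihood, shuffled order:", ll_shuffled)

# Two distinct frames that fall in the same quantization cell.
pair = np.array([[0.1, 0.0], [0.4, -0.2]])
codes = quantize(pair, codebook=means)
pair_density = gmm_frame_log_density(pair, weights, means, variances)
print("VQ codes:", codes)
print("GMM log densities:", pair_density)
```

Because the sequence score is a plain sum over frames, any permutation of the frames yields the same value up to floating-point rounding, which is the order-blindness the abstract describes. The quantizer, by contrast, keeps the order of the codes but loses the within-category difference between the two frames.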
- Authors:
- Hogden, J.; Scovel, J.C.
- Publication Date:
- 1998-11-01
- Research Org.:
- Los Alamos National Lab., NM (United States)
- Sponsoring Org.:
- USDOE, Washington, DC (United States)
- OSTI Identifier:
- 677150
- Report Number(s):
- LA-UR-98-1378
ON: DE99000844; TRN: AHC29821%%285
- DOE Contract Number:
- W-7405-ENG-36
- Resource Type:
- Technical Report
- Resource Relation:
- Other Information: PBD: [1998]
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 99 MATHEMATICS, COMPUTERS, INFORMATION SCIENCE, MANAGEMENT, LAW, MISCELLANEOUS; SPEECH; ALGORITHMS; WAVE FORMS; PATTERN RECOGNITION; MAXIMUM-LIKELIHOOD FIT; PROBABILITY
Citation Formats
Hogden, J., and Scovel, J.C. MALCOM X: Combining maximum likelihood continuity mapping with Gaussian mixture models. United States: N. p., 1998.
Web. doi:10.2172/677150.
Hogden, J., & Scovel, J.C. MALCOM X: Combining maximum likelihood continuity mapping with Gaussian mixture models. United States. doi:10.2172/677150.
Hogden, J., and Scovel, J.C. 1998.
"MALCOM X: Combining maximum likelihood continuity mapping with Gaussian mixture models". United States.
doi:10.2172/677150. https://www.osti.gov/servlets/purl/677150.
@article{osti_677150,
title = {MALCOM X: Combining maximum likelihood continuity mapping with Gaussian mixture models},
author = {Hogden, J. and Scovel, J.C.},
abstractNote = {GMMs are among the best speaker recognition algorithms currently available. However, the GMM's estimate of the probability of the speech signal does not change if the authors randomly shuffle the temporal order of the feature vectors, even though the actual probability of observing the shuffled signal would be dramatically different--probably near zero. A potential way to improve the performance of GMMs is to incorporate temporal information into the estimate of the probability of the data. Doing so could improve speech recognition, speaker recognition, and potentially aid in detecting lies (abnormalities) in speech data. As described in other documents (Hogden, 1996), MALCOM is an algorithm that can be used to estimate the probability of a sequence of categorical data. MALCOM can also be applied to speech (and other real-valued sequences) if windows of the speech are first categorized using a technique such as vector quantization (Gray, 1984). However, by quantizing the windows of speech, MALCOM ignores information about the within-category differences of the speech windows. Thus, MALCOM and GMMs complement each other: MALCOM is good at using sequence information whereas GMMs capture within-category differences better than the vector quantization typically used by MALCOM. An extension of MALCOM (MALCOM X) that can be used for estimating the probability of a speech sequence is described here.},
doi = {10.2172/677150},
place = {United States},
year = {1998},
month = {11}
}
The author describes a novel time-series analysis technique called maximum likelihood continuity mapping (MALCOM), and focuses on one application of MALCOM: detecting fraud in medical insurance claims. Given a training data set composed of typical sequences, MALCOM creates a stochastic model of sequence generation, called a continuity map (CM). A CM maximizes the probability of sequences in the training set given the model constraints. CMs can also be used to estimate the likelihood of sequences not found in the training set, enabling anomaly detection and sequence prediction, both important aspects of data mining. Since MALCOM can be used on sequences of categorical data ...
LLL-UHMLE: maximum likelihood estimates for the general normal mixture, preliminary user's guide. [In FORTRAN for CDC 7600]
UHMLE is a FORTRAN program to compute maximum likelihood estimates for the parameters (means, covariances, proportions) in a mixture of M multivariate (N-dimensional) normal density functions, given a sample of observation vectors. 1 figure.
Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding
The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge or the formulation of our knowledge about articulation may ...
Statistical Validation of Engineering and Scientific Models: A Maximum Likelihood Based Metric
Two major issues associated with model validation are addressed here. First, we present a maximum likelihood approach to define and evaluate a model validation metric. The advantages of this approach are that it is more easily applied to nonlinear problems than the methods presented earlier by Hills and Trucano (1999, 2001); that it is based on optimization, for which software packages are readily available; and that it can more easily be extended to handle measurement uncertainty and prediction uncertainty with different probability structures. Several examples are presented utilizing this metric. We show conditions under which this approach reduces to the approach ...
Speech processing using maximum likelihood continuity mapping
A speech processing method is described that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator positions is described. This learning method uses a set of training data composed only of speech sounds. The speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.