U.S. Department of Energy
Office of Scientific and Technical Information

Speaker Recognition Through NLP and CWT Modeling

Conference
OSTI ID:7461

The objective of this research is to develop a system capable of identifying speakers on wiretaps from a large database (>500 speakers) with a short search time (<30 seconds) and with better than 90% accuracy. Much previous research in speaker recognition has led to algorithms that produced encouraging preliminary results but were overwhelmed when applied to populations of more than a dozen or so different speakers. The authors are investigating a solution to the "large population" problem by seeking two completely different kinds of characterizing features. These features come from the techniques of Neuro-Linguistic Programming (NLP) and the continuous wavelet transform (CWT). NLP extracts precise neurological, verbal and non-verbal information, and assimilates the information into useful patterns. These patterns are based on specific cues demonstrated by each individual, and provide ways of determining congruency between verbal and non-verbal cues. The primary NLP modalities are characterized through word spotting (verbal predicate cues, e.g., see, sound, feel), while the secondary modalities would be characterized through the speech transcription used by the individual. This has the practical effect of reducing the size of the search space and greatly speeding up the process of identifying an unknown speaker. The wavelet-based line of investigation concentrates on using vowel phonemes and non-verbal cues, such as tempo. The rationale for concentrating on vowels is that there are a limited number of vowel phonemes, and at least one of them usually appears in even the shortest of speech segments. Using the fast CWT algorithm, the details of both the formant frequency and the glottal excitation characteristics can be easily extracted from voice waveforms. The differences in the glottal excitation waveforms as well as the formant frequency are evident in the CWT output. More significantly, the CWT reveals significant detail of the glottal excitation waveform.
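
As a rough illustration of how a CWT can expose a dominant, formant-like frequency in a voiced segment, the sketch below evaluates a wavelet transform by direct summation on a synthetic two-tone signal. It is not the authors' method: the abstract does not specify their wavelet or their fast algorithm, so the Morlet wavelet, the `w0 = 6` parameter, the 1/scale normalization, the function names `morlet` and `cwt_magnitudes`, and the 700 Hz and 1200 Hz test tones are all assumptions chosen for illustration.

```python
import cmath
import math


def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet (assumed here; small correction term omitted)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def cwt_magnitudes(signal, fs, freqs, w0=6.0):
    """Mean CWT magnitude of `signal` at each analysis frequency in `freqs`.

    Direct summation, not a fast algorithm. The 1/scale normalization is an
    assumption that makes equal-amplitude sinusoids give equal peak responses.
    """
    dt = 1.0 / fs
    n = len(signal)
    result = {}
    for f in freqs:
        scale = w0 / (2.0 * math.pi * f)          # scale whose centre frequency is f
        half = int(math.ceil(4.0 * scale / dt))   # truncate the wavelet at +/- 4 scales
        kernel = [
            morlet(k * dt / scale, w0).conjugate() * dt / scale
            for k in range(-half, half + 1)
        ]
        total = 0.0
        for i in range(n):
            acc = 0j
            for j, kv in enumerate(kernel):
                idx = i + j - half
                if 0 <= idx < n:                  # samples outside the signal count as zero
                    acc += signal[idx] * kv
            total += abs(acc)
        result[f] = total / n
    return result


# Synthetic stand-in for a voiced segment: a strong 700 Hz tone plus a weaker
# 1200 Hz tone (illustrative values only, not taken from the source).
fs = 8000.0
signal = [
    math.sin(2 * math.pi * 700 * k / fs) + 0.5 * math.sin(2 * math.pi * 1200 * k / fs)
    for k in range(240)
]
candidates = [300.0, 500.0, 700.0, 900.0, 1100.0, 1300.0]
mags = cwt_magnitudes(signal, fs, candidates)
strongest = max(mags, key=mags.get)
print(strongest)
```

Averaging the magnitude over time discards the timing detail that would matter for glottal excitation analysis; a fuller treatment would keep the whole time-scale plane and use a finer grid of analysis frequencies.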

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN
Sponsoring Organization:
USDOE Office of Science
DOE Contract Number:
AC05-96OR22464
OSTI ID:
7461
Report Number(s):
ORNL/CP-103182; ON: DE00007461
Country of Publication:
United States
Language:
English

Similar Records

Speaker recognition through NLP and CWT modeling.
Conference · Wed Jun 23 00:00:00 EDT 1999 · OSTI ID:11824

Speaker verification system using acoustic data and non-acoustic data
Patent · Mon Mar 20 23:00:00 EST 2006 · OSTI ID:908544

High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings
Journal Article · Mon Mar 28 00:00:00 EDT 2016 · PLoS ONE · OSTI ID:1379118