U.S. Department of Energy
Office of Scientific and Technical Information

Speaker recognition through NLP and CWT modeling.

Conference · OSTI ID: 11824

The objective of this research is to develop a system capable of identifying speakers on wiretaps from a large database (>500 speakers) with a short search time (<30 seconds) and better than 90% accuracy. Much previous research in speaker recognition has produced algorithms that gave encouraging preliminary results but were overwhelmed when applied to populations of more than a dozen or so different speakers. The authors are investigating a solution to the "huge population" problem by seeking two completely different kinds of characterizing features. These features are extracted using the techniques of Neuro-Linguistic Programming (NLP) and the continuous wavelet transform (CWT). NLP extracts precise neurological, verbal, and non-verbal information and assimilates it into useful patterns. These patterns are based on specific cues demonstrated by each individual and provide ways of determining congruency between verbal and non-verbal cues. The primary NLP modalities are characterized through word spotting (verbal predicate cues, e.g., see, sound, feel), while the secondary modalities would be characterized through the speech transcription used by the individual. This has the practical effect of reducing the size of the search space and greatly speeding up the process of identifying an unknown speaker. The wavelet-based line of investigation concentrates on vowel phonemes and non-verbal cues, such as tempo. The rationale for concentrating on vowels is that there are a limited number of vowel phonemes, and at least one of them usually appears in even the shortest speech segments. Using the fast CWT algorithm, the details of both the formant frequency and the glottal excitation characteristics can be easily extracted from voice waveforms. The differences in the glottal excitation waveforms, as well as in the formant frequency, are evident in the CWT output. More importantly, the CWT reveals significant detail of the glottal excitation waveform.
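
As a rough illustration of how a CWT can expose frequency structure in a voice-like waveform, the sketch below computes a scalogram with NumPy. It is not the authors' fast CWT algorithm, which this record does not describe. The complex Morlet wavelet, the direct-convolution approach, the names `morlet_cwt` and `two_tone`, and all parameter values are assumptions chosen for illustration. The test signal is a synthetic pair of tones at 700 Hz and 1200 Hz, a crude stand-in for two formants, and it does not model glottal excitation.

```python
import numpy as np


def morlet_cwt(signal, sample_rate, freqs, n_cycles=6.0):
    """Magnitude of a complex-Morlet CWT, shape (len(freqs), len(signal)).

    Illustrative only: one direct convolution per analysis frequency.
    """
    signal = np.asarray(signal, dtype=float)
    magnitudes = np.empty((len(freqs), signal.size))
    for i, f in enumerate(freqs):
        # Gaussian envelope width (seconds) so the wavelet spans ~n_cycles.
        sigma_t = n_cycles / (2.0 * np.pi * f)
        half = int(np.ceil(4.0 * sigma_t * sample_rate))
        if 2 * half + 1 > signal.size:
            raise ValueError("signal too short for the lowest analysis frequency")
        t = np.arange(-half, half + 1) / sample_rate
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2.0 * sigma_t**2))
        # Normalize so responses are comparable across analysis frequencies.
        wavelet /= np.abs(wavelet).sum()
        magnitudes[i] = np.abs(np.convolve(signal, wavelet, mode="same"))
    return magnitudes


def two_tone(sample_rate=8000, n_samples=800, f1=700.0, f2=1200.0):
    """Hypothetical test signal: a strong tone at f1 plus a weaker tone at f2."""
    t = np.arange(n_samples) / sample_rate
    return np.sin(2 * np.pi * f1 * t) + 0.5 * np.sin(2 * np.pi * f2 * t)


sample_rate = 8000
freqs = np.arange(200.0, 2001.0, 50.0)      # analysis frequencies in Hz
signal = two_tone(sample_rate)
scalogram = morlet_cwt(signal, sample_rate, freqs)

# Average over the middle of the signal to avoid edge effects,
# then pick the analysis frequency with the strongest response.
profile = scalogram[:, 200:600].mean(axis=1)
dominant_hz = freqs[np.argmax(profile)]
print(dominant_hz)
```

On real speech, the rows of `scalogram` would be inspected over time rather than averaged, since the abstract's interest is in the detail of the glottal excitation waveform as well as the formant frequency.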

Research Organization: Argonne National Lab., IL (US)
Sponsoring Organization: US Department of Energy (US)
DOE Contract Number: W-31109-ENG-38
OSTI ID: 11824
Report Number(s): ANL/ED/CP-99102
Country of Publication: United States
Language: English

Similar Records

Speaker Recognition Through NLP and CWT Modeling
Conference · Wed Jun 16 00:00:00 EDT 1999 · OSTI ID:7461

Speaker verification system using acoustic data and non-acoustic data
Patent · Mon Mar 20 23:00:00 EST 2006 · OSTI ID:908544

High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings
Journal Article · Mon Mar 28 00:00:00 EDT 2016 · PLoS ONE · OSTI ID:1379118