DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings

Abstract

A complete neurobiological understanding of speech motor control requires determination of the relationship between simultaneously recorded neural activity and the kinematics of the lips, jaw, tongue, and larynx. Many speech articulators are internal to the vocal tract, and therefore simultaneously tracking the kinematics of all articulators is nontrivial, especially in the context of human electrophysiology recordings. Here, we describe a noninvasive, multi-modal imaging system to monitor vocal tract kinematics, demonstrate this system in six speakers during production of nine American English vowels, and provide new analysis of such data. Classification and regression analysis revealed considerable variability in the articulator-to-acoustic relationship across speakers. Non-negative matrix factorization extracted basis sets capturing vocal tract shapes allowing for higher vowel classification accuracy than traditional methods. Statistical speech synthesis generated speech from vocal tract measurements, and we demonstrate perceptual identification. We demonstrate the capacity to predict lip kinematics from ventral sensorimotor cortical activity. These results demonstrate a multi-modal system to non-invasively monitor articulator kinematics during speech production, describe novel analytic methods for relating kinematic data to speech acoustics, and provide the first decoding of speech kinematics from electrocorticography. These advances will be critical for understanding the cortical basis of speech production and the creation of vocal prosthetics.

Authors:
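Bouchard, Kristofer E. [1]; Conant, David F. [2]; Anumanchipalli, Gopala K. [2]; Dichter, Benjamin [2]; Chaisanguanthum, Kris S. [2]; Johnson, Keith [3]; Chang, Edward F. [2]; Gribble, Paul L. [4]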
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California San Francisco (UCSF), San Francisco, CA (United States)
  2. Univ. of California San Francisco (UCSF), San Francisco, CA (United States)
  3. Univ. of California, Berkeley, CA (United States)
  4. The Univ. of Western Ontario, Ontario (Canada)
Publication Date:
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1379118
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
PLoS ONE
Additional Journal Information:
Journal Volume: 11; Journal Issue: 3; Journal ID: ISSN 1932-6203
Publisher:
Public Library of Science
Country of Publication:
United States
Language:
English
Subject:
60 APPLIED LIFE SCIENCES; vowels; speech; tongue; acoustics; kinematics; speech signal processing; bioacoustics; lips

Citation Formats

Bouchard, Kristofer E., Conant, David F., Anumanchipalli, Gopala K., Dichter, Benjamin, Chaisanguanthum, Kris S., Johnson, Keith, Chang, Edward F., and Gribble, Paul L. High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings. United States: N. p., 2016. Web. doi:10.1371/journal.pone.0151327.
Bouchard, Kristofer E., Conant, David F., Anumanchipalli, Gopala K., Dichter, Benjamin, Chaisanguanthum, Kris S., Johnson, Keith, Chang, Edward F., & Gribble, Paul L. High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings. United States. doi:10.1371/journal.pone.0151327.
Bouchard, Kristofer E., Conant, David F., Anumanchipalli, Gopala K., Dichter, Benjamin, Chaisanguanthum, Kris S., Johnson, Keith, Chang, Edward F., and Gribble, Paul L. 2016. "High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings". United States. doi:10.1371/journal.pone.0151327. https://www.osti.gov/servlets/purl/1379118.
@article{osti_1379118,
title = {High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings},
author = {Bouchard, Kristofer E. and Conant, David F. and Anumanchipalli, Gopala K. and Dichter, Benjamin and Chaisanguanthum, Kris S. and Johnson, Keith and Chang, Edward F. and Gribble, Paul L.},
abstractNote = {A complete neurobiological understanding of speech motor control requires determination of the relationship between simultaneously recorded neural activity and the kinematics of the lips, jaw, tongue, and larynx. Many speech articulators are internal to the vocal tract, and therefore simultaneously tracking the kinematics of all articulators is nontrivial, especially in the context of human electrophysiology recordings. Here, we describe a noninvasive, multi-modal imaging system to monitor vocal tract kinematics, demonstrate this system in six speakers during production of nine American English vowels, and provide new analysis of such data. Classification and regression analysis revealed considerable variability in the articulator-to-acoustic relationship across speakers. Non-negative matrix factorization extracted basis sets capturing vocal tract shapes allowing for higher vowel classification accuracy than traditional methods. Statistical speech synthesis generated speech from vocal tract measurements, and we demonstrate perceptual identification. We demonstrate the capacity to predict lip kinematics from ventral sensorimotor cortical activity. These results demonstrate a multi-modal system to non-invasively monitor articulator kinematics during speech production, describe novel analytic methods for relating kinematic data to speech acoustics, and provide the first decoding of speech kinematics from electrocorticography. These advances will be critical for understanding the cortical basis of speech production and the creation of vocal prosthetics.},
doi = {10.1371/journal.pone.0151327},
journal = {PLoS ONE},
number = 3,
volume = 11,
place = {United States},
year = {2016},
month = {3}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 12 works
Citation information provided by
Web of Science

