DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings

Abstract

A complete neurobiological understanding of speech motor control requires determination of the relationship between simultaneously recorded neural activity and the kinematics of the lips, jaw, tongue, and larynx. Many speech articulators are internal to the vocal tract, and therefore simultaneously tracking the kinematics of all articulators is nontrivial, especially in the context of human electrophysiology recordings. Here, we describe a noninvasive, multi-modal imaging system to monitor vocal tract kinematics, demonstrate this system in six speakers during production of nine American English vowels, and provide new analysis of such data. Classification and regression analysis revealed considerable variability in the articulator-to-acoustic relationship across speakers. Non-negative matrix factorization extracted basis sets capturing vocal tract shapes allowing for higher vowel classification accuracy than traditional methods. Statistical speech synthesis generated speech from vocal tract measurements, and we demonstrate perceptual identification. We demonstrate the capacity to predict lip kinematics from ventral sensorimotor cortical activity. These results demonstrate a multi-modal system to non-invasively monitor articulator kinematics during speech production, describe novel analytic methods for relating kinematic data to speech acoustics, and provide the first decoding of speech kinematics from electrocorticography. These advances will be critical for understanding the cortical basis of speech production and the creation of vocal prosthetics.

Authors:
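Bouchard, Kristofer E. [1]; Conant, David F. [2]; Anumanchipalli, Gopala K. [2]; Dichter, Benjamin [2]; Chaisanguanthum, Kris S. [2]; Johnson, Keith [3]; Chang, Edward F. [2]; Gribble, Paul L. [4]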
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California San Francisco (UCSF), San Francisco, CA (United States)
  2. Univ. of California San Francisco (UCSF), San Francisco, CA (United States)
  3. Univ. of California, Berkeley, CA (United States)
  4. The Univ. of Western Ontario, Ontario (Canada)
Publication Date:
2016-03-28
Research Org.:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1379118
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
PLoS ONE
Additional Journal Information:
Journal Volume: 11; Journal Issue: 3; Journal ID: ISSN 1932-6203
Publisher:
Public Library of Science
Country of Publication:
United States
Language:
English
Subject:
60 APPLIED LIFE SCIENCES; vowels; speech; tongue; acoustics; kinematics; speech signal processing; bioacoustics; lips

Citation Formats

Bouchard, Kristofer E., Conant, David F., Anumanchipalli, Gopala K., Dichter, Benjamin, Chaisanguanthum, Kris S., Johnson, Keith, Chang, Edward F., and Gribble, Paul L. High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings. United States: N. p., 2016. Web. doi:10.1371/journal.pone.0151327.
Bouchard, Kristofer E., Conant, David F., Anumanchipalli, Gopala K., Dichter, Benjamin, Chaisanguanthum, Kris S., Johnson, Keith, Chang, Edward F., & Gribble, Paul L. High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings. United States. https://doi.org/10.1371/journal.pone.0151327
Bouchard, Kristofer E., Conant, David F., Anumanchipalli, Gopala K., Dichter, Benjamin, Chaisanguanthum, Kris S., Johnson, Keith, Chang, Edward F., and Gribble, Paul L. 2016. "High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings". United States. https://doi.org/10.1371/journal.pone.0151327. https://www.osti.gov/servlets/purl/1379118.
@article{osti_1379118,
title = {High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings},
author = {Bouchard, Kristofer E. and Conant, David F. and Anumanchipalli, Gopala K. and Dichter, Benjamin and Chaisanguanthum, Kris S. and Johnson, Keith and Chang, Edward F. and Gribble, Paul L.},
abstractNote = {A complete neurobiological understanding of speech motor control requires determination of the relationship between simultaneously recorded neural activity and the kinematics of the lips, jaw, tongue, and larynx. Many speech articulators are internal to the vocal tract, and therefore simultaneously tracking the kinematics of all articulators is nontrivial, especially in the context of human electrophysiology recordings. Here, we describe a noninvasive, multi-modal imaging system to monitor vocal tract kinematics, demonstrate this system in six speakers during production of nine American English vowels, and provide new analysis of such data. Classification and regression analysis revealed considerable variability in the articulator-to-acoustic relationship across speakers. Non-negative matrix factorization extracted basis sets capturing vocal tract shapes allowing for higher vowel classification accuracy than traditional methods. Statistical speech synthesis generated speech from vocal tract measurements, and we demonstrate perceptual identification. We demonstrate the capacity to predict lip kinematics from ventral sensorimotor cortical activity. These results demonstrate a multi-modal system to non-invasively monitor articulator kinematics during speech production, describe novel analytic methods for relating kinematic data to speech acoustics, and provide the first decoding of speech kinematics from electrocorticography. These advances will be critical for understanding the cortical basis of speech production and the creation of vocal prosthetics.},
doi = {10.1371/journal.pone.0151327},
journal = {PLoS ONE},
number = 3,
volume = 11,
place = {United States},
year = {2016},
month = {3}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 19 works
Citation information provided by
Web of Science

Works referenced in this record:

Factor analysis of tongue shapes
journal, September 1977

  • Harshman, Richard; Ladefoged, Peter; Goldstein, Louis
  • The Journal of the Acoustical Society of America, Vol. 62, Issue 3
  • DOI: 10.1121/1.381581

Linking facial animation, head motion and speech acoustics
journal, July 2002

  • Yehia, Hani C.; Kuratate, Takaaki; Vatikiotis-Bateson, Eric
  • Journal of Phonetics, Vol. 30, Issue 3
  • DOI: 10.1006/jpho.2002.0165

Variability in production of the vowels /i/ and /a/
journal, May 1985

  • Perkell, Joseph S.; Nelson, Winston L.
  • The Journal of the Acoustical Society of America, Vol. 77, Issue 5
  • DOI: 10.1121/1.391940

Directional information from neuronal ensembles in the primate orofacial sensorimotor cortex
journal, September 2013

  • Arce, F. I.; Lee, J. -C.; Ross, C. F.
  • Journal of Neurophysiology, Vol. 110, Issue 6
  • DOI: 10.1152/jn.00144.2013

Decoding spoken words using local field potentials recorded from the cortical surface
journal, September 2010


Electrocorticographic representations of segmental features in continuous speech
journal, February 2015

  • Lotte, Fabien; Brumberg, Jonathan S.; Brunner, Peter
  • Frontiers in Human Neuroscience, Vol. 09
  • DOI: 10.3389/fnhum.2015.00097

Vocal tract area functions from magnetic resonance imaging
journal, July 1996

  • Story, Brad H.; Titze, Ingo R.; Hoffman, Eric A.
  • The Journal of the Acoustical Society of America, Vol. 100, Issue 1
  • DOI: 10.1121/1.415960

Control Methods Used in a Study of the Vowels
journal, March 1952

  • Peterson, Gordon E.; Barney, Harold L.
  • The Journal of the Acoustical Society of America, Vol. 24, Issue 2
  • DOI: 10.1121/1.1906875

Dynamics of Vowel Articulation
journal, April 1982


Brain-to-text: Decoding spoken phrases from phone representations in the brain
text, January 2015


Modulation Dynamics in the Orofacial Sensorimotor Cortex during Motor Skill Acquisition
journal, April 2014


An articulatory study of fricative consonants using magnetic resonance imaging
journal, September 1995

  • Narayanan, Shrikanth S.; Alwan, Abeer A.; Haker, Katherine
  • The Journal of the Acoustical Society of America, Vol. 98, Issue 3
  • DOI: 10.1121/1.413469

Individual differences in vowel production
journal, August 1993

  • Johnson, Keith; Ladefoged, Peter; Lindau, Mona
  • The Journal of the Acoustical Society of America, Vol. 94, Issue 2
  • DOI: 10.1121/1.406887

Control of Spoken Vowel Acoustics and the Influence of Phonetic Context in Human Speech Sensorimotor Cortex
journal, September 2014


Automatic data-driven learning of articulatory primitives from real-time MRI data using convolutive NMF with sparseness constraints
conference, August 2011


Acoustical Consequences of Lip, Tongue, Jaw, and Larynx Movement
journal, July 1970

  • Lindblom, B.; Sundberg, J.
  • The Journal of the Acoustical Society of America, Vol. 48, Issue 1A
  • DOI: 10.1121/1.1974958

Temporal Aspects of Articulatory Movements for /s/-Stop Clusters
journal, January 1979


Functional organization of human sensorimotor cortex for speech articulation
journal, February 2013

  • Bouchard, Kristofer E.; Mesgarani, Nima; Johnson, Keith
  • Nature, Vol. 495, Issue 7441
  • DOI: 10.1038/nature11911

Laryngeal vibrations: A comparison between high‐speed filming and glottographic techniques
journal, April 1983

  • Baer, Thomas; Löfqvist, Anders; McGarr, Nancy S.
  • The Journal of the Acoustical Society of America, Vol. 73, Issue 4
  • DOI: 10.1121/1.389279

An approach to real-time magnetic resonance imaging for speech production
journal, April 2004

  • Narayanan, Shrikanth; Nayak, Krishna; Lee, Sungbok
  • The Journal of the Acoustical Society of America, Vol. 115, Issue 4
  • DOI: 10.1121/1.1652588

Analysis of vocal tract shape and dimensions using magnetic resonance imaging: Vowels
journal, August 1991

  • Baer, T.; Gore, J. C.; Gracco, L. C.
  • The Journal of the Acoustical Society of America, Vol. 90, Issue 2
  • DOI: 10.1121/1.401949

Interarticulator programming in VCV sequences: Lip and tongue movements
journal, March 1999

  • Löfqvist, Anders; Gracco, Vincent L.
  • The Journal of the Acoustical Society of America, Vol. 105, Issue 3
  • DOI: 10.1121/1.426723

Patterns of Sounds
book, January 2010


Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
journal, March 2008


Articulatory tradeoffs reduce acoustic variability during American English /r/ production
journal, May 1999

  • Guenther, Frank H.; Espy-Wilson, Carol Y.; Boyce, Suzanne E.
  • The Journal of the Acoustical Society of America, Vol. 105, Issue 5
  • DOI: 10.1121/1.426900

Direct classification of all American English phonemes using signals from functional speech motor cortex
journal, May 2014


Formant estimation method using inverse-filter control
journal, May 2001

  • Watanabe, A.
  • IEEE Transactions on Speech and Audio Processing, Vol. 9, Issue 4
  • DOI: 10.1109/89.917677

Tongue Body Articulation during Vowel and Diphthong Gestures
journal, January 1972

  • Kent, R. D.; Moll, K. L.
  • Folia Phoniatrica et Logopaedica, Vol. 24, Issue 4
  • DOI: 10.1159/000263574

Combinations of muscle synergies in the construction of a natural motor behavior
journal, February 2003

  • d'Avella, Andrea; Saltiel, Philippe; Bizzi, Emilio
  • Nature Neuroscience, Vol. 6, Issue 3
  • DOI: 10.1038/nn1010

Test of the movement expansion model: Anticipatory vowel lip protrusion and constriction in French and English speakers
journal, January 2011

  • Noiray, Aude; Cathiard, Marie-Agnès; Ménard, Lucie
  • The Journal of the Acoustical Society of America, Vol. 129, Issue 1
  • DOI: 10.1121/1.3518452

Acoustic characteristics of American English vowels
journal, May 1995

  • Hillenbrand, James; Getty, Laura A.; Clark, Michael J.
  • The Journal of the Acoustical Society of America, Vol. 97, Issue 5
  • DOI: 10.1121/1.411872

Learning the parts of objects by non-negative matrix factorization
journal, October 1999

  • Lee, Daniel D.; Seung, H. Sebastian
  • Nature, Vol. 401, Issue 6755
  • DOI: 10.1038/44565

Variant and invariant characteristics of speech movements
journal, December 1986

  • Gracco, V. L.; Abbs, J. H.
  • Experimental Brain Research, Vol. 65, Issue 1
  • DOI: 10.1007/bf00243838

Speech motor coordination and control: evidence from lip, jaw, and laryngeal movements
journal, November 1994


Automatic contour tracking in ultrasound images
journal, January 2005

  • Li, Min; Kambhamettu, Chandra; Stone, Maureen
  • Clinical Linguistics & Phonetics, Vol. 19, Issue 6-7
  • DOI: 10.1080/02699200500113616

Trading relations between tongue‐body raising and lip rounding in production of the vowel /u/: A pilot ‘‘motor equivalence’’ study
journal, May 1993

  • Perkell, Joseph S.; Matthies, Melanie L.; Svirsky, Mario A.
  • The Journal of the Acoustical Society of America, Vol. 93, Issue 5
  • DOI: 10.1121/1.405814

Gestural specification using dynamically-defined articulatory structures
journal, July 1990


A real-time formant tracker based on the inverse filter control method
journal, January 2007

  • Ueda, Yuichi; Hamakawa, Tomoya; Sakata, Tadashi
  • Acoustical Science and Technology, Vol. 28, Issue 4
  • DOI: 10.1250/ast.28.271

Some Basic Considerations in the Analysis of Intonation
journal, April 1961

  • Lehiste, Ilse; Peterson, Gordon E.
  • The Journal of the Acoustical Society of America, Vol. 33, Issue 4
  • DOI: 10.1121/1.1908681

Brain–computer interfaces for speech communication
journal, April 2010

  • Brumberg, Jonathan S.; Nieto-Castanon, Alfonso; Kennedy, Philip R.
  • Speech Communication, Vol. 52, Issue 4
  • DOI: 10.1016/j.specom.2010.01.001

Electromagnetic midsagittal articulometer systems for transducing speech articulatory movements
journal, December 1992

  • Perkell, Joseph S.; Cohen, Marc H.; Svirsky, Mario A.
  • The Journal of the Acoustical Society of America, Vol. 92, Issue 6
  • DOI: 10.1121/1.404204

The SIGMA Algorithm: A Glottal Activity Detector for Electroglottographic Signals
journal, November 2009

  • Thomas, M. R. P.; Naylor, P. A.
  • IEEE Transactions on Audio, Speech, and Language Processing, Vol. 17, Issue 8
  • DOI: 10.1109/TASL.2009.2022430

Acoustical Consequences of Lip, Tongue, Jaw, and Larynx Movement
journal, October 1971

  • Lindblom, Björn E. F.; Sundberg, Johan E. F.
  • The Journal of the Acoustical Society of America, Vol. 50, Issue 4B
  • DOI: 10.1121/1.1912750

Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis
journal, August 2009

  • Yamagishi, Junichi; Nose, Takashi; Zen, Heiga
  • IEEE Transactions on Audio, Speech, and Language Processing, Vol. 17, Issue 6
  • DOI: 10.1109/TASL.2009.2016394

Acoustic characteristics of American English vowels
journal, May 1994

  • Hillenbrand, James; Getty, Laura A.; Wheeler, Kimberlee
  • The Journal of the Acoustical Society of America, Vol. 95, Issue 5
  • DOI: 10.1121/1.409456

When Does Non-Negative Matrix Factorization Give a Correct Decomposition into Parts?
text, January 2004

  • Donoho, David L.; Stodden, Victoria C.
  • Columbia University
  • DOI: 10.7916/d88d05n7

Estimation of Glottal Closure Instants in Voiced Speech Using the DYPSA Algorithm
journal, January 2007

  • Naylor, Patrick A.; Kounoudes, Anastasis; Gudnason, Jon
  • IEEE Transactions on Audio, Speech and Language Processing, Vol. 15, Issue 1
  • DOI: 10.1109/TASL.2006.876878

Three‐dimensional tongue surface shapes of English consonants and vowels
journal, June 1996

  • Stone, Maureen; Lundberg, Andrew
  • The Journal of the Acoustical Society of America, Vol. 99, Issue 6
  • DOI: 10.1121/1.414969

Random Forests
journal, January 2001


Statistical parametric speech synthesis
journal, November 2009


Works referencing / citing this record:

Formant Space Reconstruction From Brain Activity in Frontal and Temporal Regions Coding for Heard Vowels
journal, February 2019

  • Rampinini, Alessandra Cecilia; Handjaras, Giacomo; Leo, Andrea
  • Frontiers in Human Neuroscience, Vol. 13
  • DOI: 10.3389/fnhum.2019.00032

Parkinson Disease Detection from Speech Articulation Neuromechanics
journal, August 2017

  • Gómez-Vilda, Pedro; Mekyska, Jiri; Ferrández, José M.
  • Frontiers in Neuroinformatics, Vol. 11
  • DOI: 10.3389/fninf.2017.00056

Discrete Anatomical Coordinates for Speech Production and Synthesis
journal, April 2019

  • Assaneo, M. Florencia; Ramirez Butavand, Daniela; Trevisan, Marcos A.
  • Frontiers in Communication, Vol. 4
  • DOI: 10.3389/fcomm.2019.00013

Subthalamic Nucleus and Sensorimotor Cortex Activity During Speech Production
journal, January 2019


Workshops of the Sixth International Brain–Computer Interface Meeting: brain–computer interfaces past, present, and future
journal, January 2017


Discrete anatomical coordinates for speech production and synthesis
journal, June 2017

  • Assaneo, M. Florencia; Butavand, Daniela Ramirez; Trevisan, Marcos A.
  • Frontiers in Communication
  • DOI: 10.1101/148007

Workshops of the seventh international brain-computer interface meeting: not getting lost in translation
journal, July 2019


Monitoring ALS from speech articulation kinematics
journal, May 2018

  • Gómez, Pedro; Londral, Ana R. M.; Gómez, Andrés
  • Neural Computing and Applications, Vol. 32, Issue 20
  • DOI: 10.1007/s00521-018-3538-6

Vocal Tract Images Reveal Neural Representations of Sensorimotor Transformation During Speech Imitation
journal, March 2017

  • Carey, Daniel; Miquel, Marc E.; Evans, Bronwen G.
  • Cerebral Cortex, Vol. 27, Issue 5
  • DOI: 10.1093/cercor/bhx056

A Neuromotor to Acoustical Jaw-Tongue Projection Model With Application in Parkinson’s Disease Hypokinetic Dysarthria
journal, March 2021