U.S. Department of Energy
Office of Scientific and Technical Information

A Task-Optimized Neural Network Replicates Human Auditory Behavior, Predicts Brain Responses, and Reveals a Cortical Processing Hierarchy

Journal Article · Neuron
 [1];  [2];  [3];  [3];  [4]
  1. Massachusetts Institute of Technology (MIT), Cambridge, MA (United States); DOE/OSTI
  2. Stanford University, CA (United States); Stanford Neurosciences Institute, Stanford, CA (United States)
  3. Massachusetts Institute of Technology (MIT), Cambridge, MA (United States)
  4. Massachusetts Institute of Technology (MIT), Cambridge, MA (United States); Harvard University, Cambridge, MA (United States)

A core goal of auditory neuroscience is to build quantitative models that predict cortical responses to natural sounds. Reasoning that a complete model of auditory cortex must solve ecologically relevant tasks, we optimized hierarchical neural networks for speech and music recognition. The best-performing network contained separate music and speech pathways following early shared processing, potentially replicating human cortical organization. The network performed both tasks as well as humans and exhibited human-like errors despite not being optimized to do so, suggesting common constraints on network and human performance. The network predicted fMRI voxel responses substantially better than traditional spectrotemporal filter models throughout auditory cortex. It also provided a quantitative signature of cortical representational hierarchy: primary and non-primary responses were best predicted by intermediate and late network layers, respectively. These results suggest that task optimization provides a powerful set of tools for modeling sensory systems.

Research Organization:
Krell Institute, Ames, IA (United States); Massachusetts Institute of Technology (MIT), Cambridge, MA (United States)
Sponsoring Organization:
USDOE Office of Science (SC); NVIDIA Corporation; National Institutes of Health (NIH); McDonnell Scholar Award; National Science Foundation (NSF)
Grant/Contract Number:
FG02-97ER25308
OSTI ID:
1538638
Journal Information:
Neuron, Vol. 98, Issue 3; ISSN 0896-6273
Publisher:
Cell Press - Elsevier
Country of Publication:
United States
Language:
English


Cited By (2)

Deep neuroethology of a virtual rodent preprint January 2019
Predictive Coding Can Do Exact Backpropagation on Convolutional and Recurrent Neural Networks preprint January 2021