Context-Dependent Piano Music Transcription With Convolutional Sparse Coding

Cogliati, Andrea; Duan, Zhiyao; Wohlberg, Brendt

doi:10.1109/TASLP.2016.2598305

Abstract

This study presents a novel approach to automatic transcription of piano music in a context-dependent setting. This approach employs convolutional sparse coding to approximate the music waveform as the summation of piano note waveforms (dictionary elements) convolved with their temporal activations (onset transcription). The piano note waveforms are pre-recorded for the specific piano to be transcribed in the specific environment. During transcription, the note waveforms are fixed and their temporal activations are estimated and post-processed to obtain the pitch and onset transcription. This approach works in the time domain, models temporal evolution of piano notes, and estimates pitches and onsets simultaneously in the same framework. Finally, experiments show that it significantly outperforms a state-of-the-art music transcription method trained in the same context-dependent setting, in both transcription accuracy and time precision, in various scenarios including synthetic, anechoic, noisy, and reverberant environments.
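The signal model in the abstract can be illustrated with a small sketch. It is a toy under stated assumptions, not the authors' implementation: the function names `synthesize` and `csc_ista`, the synthetic decaying-sinusoid "notes", the penalty weight `lam`, and the use of plain ISTA (proximal gradient descent) as the solver are all choices made here for brevity. The abstract does not say which solver the paper uses, and the post-processing below is reduced to taking the largest activation per note.

```python
import numpy as np

def synthesize(dictionary, activations):
    """Model: sum over notes of (note waveform convolved with its activation)."""
    n = activations.shape[1]
    return sum(np.convolve(x, d)[:n] for d, x in zip(dictionary, activations))

def csc_ista(signal, dictionary, lam=0.05, n_iter=200):
    """Minimize 0.5*||sum_m d_m * x_m - s||^2 + lam*sum_m ||x_m||_1 with ISTA."""
    x = np.zeros((len(dictionary), len(signal)))
    # Conservative step: 1 / (upper bound on the squared operator norm).
    step = 1.0 / sum(np.abs(d).sum() for d in dictionary) ** 2
    for _ in range(n_iter):
        residual = synthesize(dictionary, x) - signal
        # Gradient for note m: correlation of the residual with d_m (lags >= 0).
        grad = np.stack([np.correlate(residual, d, mode="full")[len(d) - 1:]
                         for d in dictionary])
        x = x - step * grad
        # Soft threshold: proximal step for the l1 penalty.
        x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)
    return x

# Toy "piano": two decaying sinusoids stand in for recorded note waveforms.
fs, note_len, n = 4000, 200, 800
t = np.arange(note_len)
dictionary = [np.exp(-t / 60.0) * np.sin(2 * np.pi * f * t / fs)
              for f in (220.0, 330.0)]

# Ground truth: note 0 struck at sample 100, note 1 at sample 400.
truth = np.zeros((2, n))
truth[0, 100] = 1.0
truth[1, 400] = 1.0
signal = synthesize(dictionary, truth)

x = csc_ista(signal, dictionary)
onsets = np.argmax(np.abs(x), axis=1)  # crude stand-in for peak picking
print(onsets)
```

The dictionary stays fixed throughout, mirroring the context-dependent setting in which note waveforms are recorded in advance; only the activations are estimated. Working on raw samples is what lets the onset estimate be resolved at the sample level rather than at a spectrogram frame.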

Authors:

Cogliati, Andrea [1]; Duan, Zhiyao [1]; Wohlberg, Brendt [2]

[1] Univ. of Rochester, NY (United States). Dept. of Electrical and Computer Engineering
[2] Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

Publication Date: 2016-08-04

Research Org.: Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

Sponsoring Org.: USDOE

OSTI Identifier: 1457305

Report Number(s): LA-UR-15-29587
Journal ID: ISSN 2329-9290

Grant/Contract Number: AC52-06NA25396

Resource Type: Accepted Manuscript

Journal Name: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Additional Journal Information: Journal Volume: 24; Journal Issue: 12; Journal ID: ISSN 2329-9290

Publisher: IEEE - ACM

Country of Publication: United States

Language: English

Subject: 47 OTHER INSTRUMENTATION; automatic music transcription; piano transcription; reverberation; convolutional sparse coding

Citation Formats


Cogliati, Andrea, Duan, Zhiyao, and Wohlberg, Brendt. Context-Dependent Piano Music Transcription With Convolutional Sparse Coding. United States: N. p., 2016. Web. doi:10.1109/TASLP.2016.2598305.

Cogliati, Andrea, Duan, Zhiyao, & Wohlberg, Brendt. Context-Dependent Piano Music Transcription With Convolutional Sparse Coding. United States. https://doi.org/10.1109/TASLP.2016.2598305

Cogliati, Andrea, Duan, Zhiyao, and Wohlberg, Brendt. 2016. "Context-Dependent Piano Music Transcription With Convolutional Sparse Coding". United States. https://doi.org/10.1109/TASLP.2016.2598305. https://www.osti.gov/servlets/purl/1457305.

@article{osti_1457305,
  title        = {Context-Dependent Piano Music Transcription With Convolutional Sparse Coding},
  author       = {Cogliati, Andrea and Duan, Zhiyao and Wohlberg, Brendt},
  abstractNote = {This study presents a novel approach to automatic transcription of piano music in a context-dependent setting. This approach employs convolutional sparse coding to approximate the music waveform as the summation of piano note waveforms (dictionary elements) convolved with their temporal activations (onset transcription). The piano note waveforms are pre-recorded for the specific piano to be transcribed in the specific environment. During transcription, the note waveforms are fixed and their temporal activations are estimated and post-processed to obtain the pitch and onset transcription. This approach works in the time domain, models temporal evolution of piano notes, and estimates pitches and onsets simultaneously in the same framework. Finally, experiments show that it significantly outperforms a state-of-the-art music transcription method trained in the same context-dependent setting, in both transcription accuracy and time precision, in various scenarios including synthetic, anechoic, noisy, and reverberant environments.},
  doi          = {10.1109/TASLP.2016.2598305},
  journal      = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  number       = 12,
  volume       = 24,
  place        = {United States},
  year         = {2016},
  month        = aug
}


Journal Article:

Free Publicly Available Full Text

Accepted Manuscript (DOE)

Publisher's Version of Record

https://doi.org/10.1109/TASLP.2016.2598305


Citation Metrics:

Cited by: 18 works

Citation information provided by Web of Science


Works referencing / citing this record:

Deep learning-based automatic downbeat tracking: a brief review
journal, March 2019

Jia, Bijue; Lv, Jiancheng; Liu, Dayiheng
Multimedia Systems, Vol. 25, Issue 6
DOI: 10.1007/s00530-019-00607-x

Similar Records in DOE PAGES and OSTI.GOV collections:

Context-dependent piano music transcription with convolutional sparse coding

Patent: Cogliati, Andrea; Duan, Zhiyao; Wohlberg, Brendt Egon

The present disclosure presents a novel approach to automatic transcription of piano music in a context-dependent setting. Embodiments described herein may employ an efficient algorithm for convolutional sparse coding to approximate a music waveform as a summation of piano note waveforms convolved with associated temporal activations. The piano note waveforms may be pre-recorded for a particular piano that is to be transcribed and may optionally be pre-recorded in the specific environment where the piano performance is to be performed. During transcription, the note waveforms may be fixed and associated temporal activations may be estimated and post-processed to obtain the pitch ...
Full Text Available
Piano Transcription with Convolutional Sparse Lateral Inhibition

Journal Article: Cogliati, Andrea; Duan, Zhiyao; Wohlberg, Brendt Egon - IEEE Signal Processing Letters

This paper extends our prior work on context-dependent piano transcription to estimate the length of the notes in addition to their pitch and onset. This approach employs convolutional sparse coding along with lateral inhibition constraints to approximate a musical signal as the sum of piano note waveforms (dictionary elements) convolved with their temporal activations. The waveforms are pre-recorded for the specific piano to be transcribed in the specific environment. A dictionary containing multiple waveforms per pitch is generated by truncating a long waveform for each pitch to different lengths. During transcription, the dictionary elements are fixed and their temporal activations ...
Cited by 9
https://doi.org/10.1109/LSP.2017.2666183

Full Text Available
Monaural Music Source Separation using Convolutional Sparse Coding

Journal Article: Jao, Ping-Keng; Su, Li; Yang, Yi-Hsuan; ... - IEEE/ACM Transactions on Audio, Speech, and Language Processing

We present a comprehensive performance study of a new time-domain approach for estimating the components of an observed monaural audio mixture. Unlike existing time-frequency approaches that use the product of a set of spectral templates and their corresponding activation patterns to approximate the spectrogram of the mixture, the proposed approach uses the sum of a set of convolutions of estimated activations with prelearned dictionary filters to approximate the audio mixture directly in the time domain. The approximation problem can be solved by an efficient convolutional sparse coding algorithm. The effectiveness of this approach for source separation of musical audio has ...
Cited by 10
https://doi.org/10.1109/TASLP.2016.2598323

Full Text Available
Microbial Bebop: Creating Music from Complex Dynamics in Microbial Ecology

Journal Article: Larsen, Peter; Gilbert, Jack - PLoS ONE

In order for society to make effective policy decisions on complex and far-reaching subjects, such as appropriate responses to global climate change, scientists must effectively communicate complex results to the non-scientifically specialized public. However, there are few ways to transform highly complicated scientific data into formats that are engaging to the general community. Taking inspiration from patterns observed in nature and from some of the principles of jazz bebop improvisation, we have generated Microbial Bebop, a method by which microbial environmental data are transformed into music. Microbial Bebop uses meter, pitch, duration, and harmony to highlight the relationships between ...
https://doi.org/10.1371/journal.pone.0058119

Full Text Available
Selective inhibition of adenovirus type 2 early region II and III transcription by an anisomycin block of protein synthesis

Journal Article: Shaw, A R; Ziff, E B - Mol. Cell. Biol.; (United States)

The transcription of adenovirus type 2 genes proceeds through a broad three-phase program. From 1 to 4 h postinfection six early transcription units (EIa, EIb, EII, EIII, EIV, and the promoter-proximal segment of the late transcription unit) are activated. From 4 to 6 h postinfection transcription of the early genes is depressed. After the onset of viral DNA replication at ~6 h postinfection, the transcript from the late promoter is antiterminated, and this transcript dominates viral RNA synthesis. The early activation period also proceeds through a series of stages; early regions EIa and EIV are activated first, followed by early ...
https://doi.org/10.1128/MCB.2.7.789
