OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Speech recognition systems on the Cell Broadband Engine

Abstract

In this paper we describe our design, implementation, and first results of a prototype connected-phoneme-based speech recognition system on the Cell Broadband Engine™ (Cell/B.E.). Automatic speech recognition decodes speech samples into plain text (other representations are possible) and must process samples at real-time rates. Fortunately, the computational tasks involved in this pipeline are highly data-parallel and can receive significant hardware acceleration from vector-streaming architectures such as the Cell/B.E. Identifying and exploiting these parallelism opportunities is challenging, but also critical to improving system performance. We observed, from our initial performance timings, that a single Cell/B.E. processor can recognize speech from thousands of simultaneous voice channels in real time, a channel density that is orders of magnitude greater than the capacity of existing software speech recognizers based on CPUs (central processing units). This result emphasizes the potential for Cell/B.E.-based speech recognition and will likely lead to the future development of production speech systems using Cell/B.E. clusters.

Authors:
Liu, Y; Jones, H; Vaidya, S; Perrone, M; Tydlitat, B; Nanda, A
Publication Date:
2007-04-20
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
940892
Report Number(s):
UCRL-JRNL-230195
Journal ID: ISSN 0018-8646; IBMJAE; TRN: US200824%%388
DOE Contract Number:
W-7405-ENG-48
Resource Type:
Journal Article
Resource Relation:
Journal Name: IBM Journal of Research and Development, vol. 51, no. 5, August 11, 2007, pp. 583-592; Journal Volume: 51; Journal Issue: 5
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS; ACCELERATION; CAPACITY; DESIGN; ENGINES; IMPLEMENTATION; PERFORMANCE; PIPELINES; PROCESSING; PRODUCTION; SPEECH

Citation Formats

Liu, Y, Jones, H, Vaidya, S, Perrone, M, Tydlitat, B, and Nanda, A. Speech recognition systems on the Cell Broadband Engine. United States: N. p., 2007. Web. doi:10.1147/rd.515.0583.
Liu, Y, Jones, H, Vaidya, S, Perrone, M, Tydlitat, B, & Nanda, A. (2007). Speech recognition systems on the Cell Broadband Engine. United States. doi:10.1147/rd.515.0583.
Liu, Y, Jones, H, Vaidya, S, Perrone, M, Tydlitat, B, and Nanda, A. 2007. "Speech recognition systems on the Cell Broadband Engine". United States. doi:10.1147/rd.515.0583. https://www.osti.gov/servlets/purl/940892.
@article{osti_940892,
title = {Speech recognition systems on the Cell Broadband Engine},
author = {Liu, Y and Jones, H and Vaidya, S and Perrone, M and Tydlitat, B and Nanda, A},
abstractNote = {In this paper we describe our design, implementation, and first results of a prototype connected-phoneme-based speech recognition system on the Cell Broadband Engine\texttrademark{} (Cell/B.E.). Automatic speech recognition decodes speech samples into plain text (other representations are possible) and must process samples at real-time rates. Fortunately, the computational tasks involved in this pipeline are highly data-parallel and can receive significant hardware acceleration from vector-streaming architectures such as the Cell/B.E. Identifying and exploiting these parallelism opportunities is challenging, but also critical to improving system performance. We observed, from our initial performance timings, that a single Cell/B.E. processor can recognize speech from thousands of simultaneous voice channels in real time, a channel density that is orders of magnitude greater than the capacity of existing software speech recognizers based on CPUs (central processing units). This result emphasizes the potential for Cell/B.E.-based speech recognition and will likely lead to the future development of production speech systems using Cell/B.E. clusters.},
doi = {10.1147/rd.515.0583},
journal = {IBM Journal of Research and Development},
pages = {583--592},
number = 5,
volume = 51,
place = {United States},
year = {2007},
month = apr
}
  • This article discusses the automatic processing of speech signals with the aim of finding a sequence of words (speech recognition) or a concept (speech understanding) being transmitted by the speech signal. The goal of the research is to develop an automatic typewriter that will automatically edit and type text under voice control. A dynamic programming method is proposed in which all possible class signals are stored, after which the presented signal is compared to all the stored signals during the recognition phase. Topics considered include element-by-element recognition of words of speech, learning speech recognition, phoneme-by-phoneme speech recognition, the recognition of connected speech, understanding connected speech, and prospects for designing speech recognition and understanding systems. An application of the composition dynamic programming method for the solution of basic problems in the recognition and understanding of speech is presented.
  • A hidden-Markov-model (HMM) based speech recognition system was evaluated that makes use of simultaneously recorded acoustic and articulatory data. The articulatory measurements were gathered by means of electromagnetic articulography and describe the movement of small coils fixed to the speakers' tongue and jaw during the production of German V₁CV₂ sequences [P. Hoole and S. Gfoerer, J. Acoust. Soc. Am. Suppl. 1 87, S123 (1990)]. Using the coordinates of the coil positions as an articulatory representation, acoustic and articulatory features were combined to make up an acoustic-articulatory feature vector. The discriminant power of this combined representation was evaluated for two subjects on a speaker-dependent isolated word recognition task. When the articulatory measurements were used both for training and testing the HMMs, the articulatory representation was capable of reducing the error rate of comparable acoustic-based HMMs by a relative percentage of more than 60%. In a separate experiment, the articulatory movements during the testing phase were estimated using a multilayer perceptron that performed an acoustic-to-articulatory mapping. Under these more realistic conditions, when articulatory measurements are only available during the training, the error rate could be reduced by a relative percentage of 18% to 25%.
  • This paper shows how a semiautomatic design of a speech recognition system can be done as a planning activity. Recognition performances are used for deciding plan refinement. Inductive learning is performed for setting action preconditions. Experimental results in the recognition of connected letters spoken by 100 speakers are presented.
  • This paper describes a single-board implementation of an isolated word recognizer based on the principles of linear predictive coding (LPC) and dynamic time warping (DTW). The recognizer requires only a serial (RS-232) terminal, power supply, and microphone for operation, and may be used to add speech input capability to any serial terminal connected to a host computer. Key elements of the recognizer include a custom integrated circuit for DTW-based pattern matching, a single-chip implementation of real-time LPC feature measurement, and a 16-bit microprocessor for control, communication, and decision functions. As a result of the custom integrated circuit and multiple processor architecture, pattern matching speed is increased by a factor of 50 over an earlier design with no custom integrated circuits and without pipeline processing capabilities, and proceeds on one word while LPC measurement on the next is in progress, increasing speech throughput.
  • In vitro primary antibody responses to limiting concentrations of trinitrophenyl (TNP)-Ficoll were shown to be T cell dependent, requiring the cooperation of T helper (TH) cells, B cells, and accessory cells. Under these conditions, TH cells derived from long-term radiation bone marrow chimeras were major histocompatibility complex (MHC) restricted in their ability to cooperate with accessory cells expressing host-type MHC determinants. The requirement for MHC-restricted self-recognition by TNP-Ficoll-reactive B cells was assessed under these T-dependent conditions. In the presence of competent TH cells, chimeric B cells were found to be MHC restricted, cooperating only with accessory cells that expressed host-type MHC products. In contrast, the soluble products of certain monoclonal T cell lines were able to directly activate B cells in response to TNP-Ficoll, bypassing any requirement for MHC-restricted self-recognition. These findings demonstrate the existence of a novel cell interaction pathway in which B cells as well as TH cells are each required to recognize self-MHC determinants on accessory cells, but are not required to recognize each other. They further demonstrate that the requirement for self-recognition by B cells may be bypassed in certain T-dependent activation pathways.