Speech Silicon: An FPGA Architecture for Real-Time Hidden Markov-Model-Based Speech Recognition

Hindawi Publishing Corporation
EURASIP Journal on Embedded Systems
Volume 2006, Article ID 48085, Pages 1-19
DOI 10.1155/ES/2006/48085

Jeffrey Schuster, Kshitij Gupta, Raymond Hoare, and Alex K. Jones

University of Pittsburgh, Pittsburgh, PA 15261, USA

Received 21 December 2005; Revised 8 June 2006; Accepted 27 June 2006

This paper examines the design of an FPGA-based system-on-a-chip capable of performing continuous speech recognition on medium-sized vocabularies in real time. Through the creation of three dedicated pipelines, one for each of the major operations in the system, we were able to maximize the throughput of the system while simultaneously minimizing the number of pipeline stalls. Further, by implementing a token-passing scheme between the later stages of the system, the complexity of the control was greatly reduced and the amount of active data present in the system at any time was minimized. Additionally, through in-depth analysis of the SPHINX 3 large vocabulary continuous speech recognition engine, we were able to design models that could be efficiently benchmarked against a known software platform. These results, combined with the ability to reprogram the system for different recognition tasks, serve to create a system capable of performing real-time speech recognition in a vast array of environments.

Copyright © 2006 Jeffrey Schuster et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Many of today's state-of-the-art software systems rely on hidden Markov model (HMM) evaluations to calculate the probability that a particular audio sample is representative of a particular sound within a particular word [1, 2]. Such systems have been observed to achieve accuracy rates upwards