Luận án này đã giải quyết các câu hỏi của các mạng thần kinh có thể phục vụ như một nền tảng hữu ích cho một từ vựng lớn, loa độc lập, hệ thống nhận dạng tiếng nói liên tục. Chúng tôi đã thành công trong việc cho thấy rằng thực sự họ có thể, khi các mạng thần kinh được sử dụng cẩn thận và chu đáo. | 9. Conclusions This dissertation has addressed the question of whether neural networks can serve as a useful foundation for a large vocabulary speaker independent continuous speech recognition system. We succeeded in showing that indeed they can when the neural networks are used carefully and thoughtfully. . Neural Networks as Acoustic Models A speech recognition system requires solutions to the problems of both acoustic modeling and temporal modeling. The prevailing speech recognition technology Hidden Markov Models offers solutions to both of these problems acoustic modeling is provided by discrete continuous or semicontinuous density models and temporal modeling is provided by states connected by transitions arranged into a strict hierarchy of phonemes words and sentences. While an HMM s solutions are effective they suffer from a number of drawbacks. Specifically the acoustic models suffer from quantization errors and or poor parametric modeling assumptions the standard Maximum Likelihood training criterion leads to poor discrimination between the acoustic models the Independence Assumption makes it hard to exploit multiple input frames and the First-Order Assumption makes it hard to model coarticulation and duration. Given that HMMs have so many drawbacks it makes sense to consider alternative solutions. Neural networks well known for their ability to learn complex functions generalize effectively tolerate noise and support parallelism offer a promising alternative. However while today s neural networks can readily be applied to static or temporally localized pattern recognition tasks we do not yet clearly understand how to apply them to dynamic temporally extended pattern recognition tasks. Therefore in a speech recognition system it currently makes sense to use neural networks for acoustic modeling but not for temporal modeling. Based on these considerations we have investigated hybrid NN-HMM systems in which neural networks are responsible for acoustic .