Hierarchical search for large vocabulary conversational speech recognition (1999)
Content Provider | CiteSeerX |
---|---|
Author | Deshmukh, Neeraj; Ganapathiraju, Aravind; Picone, Joseph |
Abstract | Speaker-independent speech recognition technology has made significant progress since the days of isolated word recognition. Today, state-of-the-art systems are capable of performing large vocabulary continuous speech recognition (LVCSR) on audio streams derived from complex information sources such as broadcast news and two-way telephone dialogs. A significant contribution to this advancement is the development of search techniques that find suboptimal but accurate solutions in problems involving large search spaces and extremely complex statistical models. These search strategies dynamically integrate information from a number of diverse knowledge sources to determine the correct word hypothesis, and they limit the scope of the search by organizing it hierarchically. We refer to this problem as the decoding or search problem. This paper describes the complexity associated with decoding using hierarchical representations for linguistic and acoustic knowledge sources. To illustrate these concepts, it describes an extensible, object-oriented decoder that is available in the public domain and leverages current state-of-the-art technology. This decoder supports efficient handling of acoustic models for cross-word context-dependent phones, multiple pronunciations of words using lexical trees, and rescoring of word graphs based on N-gram language models in a single pass. It employs a state-of-the-art Viterbi-style dynamic programming algorithm and is equipped with several heuristic pruning criteria to minimize the consumption of computational resources while maintaining good accuracy. |
File Format | |
Volume Number | 16 |
Journal | IEEE Signal Processing Magazine |
Language | English |
Publisher Date | 1999-01-01 |
Access Restriction | Open |
Subject Keyword | Large Vocabulary Conversational Speech Recognition; Hierarchical Search; Acoustic Knowledge Source; Computational Resource; State-of-the-art Viterbi-style Dynamic; Search Strategy; N-gram Language Model; Acoustic Model; Two-way Telephone Dialog; Diverse Knowledge Source; Significant Contribution; Single Pass; Good Accuracy; Speaker-independent Speech Recognition Technology; Word Graph; State-of-the-art System; Correct Word Hypothesis; Public Domain; Lexical Tree; Search Problem; Broadcast News; Complex Information Source; Multiple Pronunciation; Large Search Space; Isolated Word Recognition; Current State-of-the-art Technology; Search Technique; Cross-word Context-dependent Phone; Accurate Solution; Significant Progress; Hierarchical Search Strategy; Efficient Handling; Complex Statistical Model; Large Vocabulary Continuous Speech Recognition; Audio Stream; Extensible Object-oriented Decoder; Hierarchical Representation |
Content Type | Text |
Resource Type | Article |
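
The abstract mentions a Viterbi-style dynamic programming search combined with heuristic pruning. As a rough illustration of that general idea only, and not the decoder described in the paper, the following Python sketch runs Viterbi decoding over a small hidden Markov model and drops any hypothesis whose log score falls more than a fixed beam width below the best one at each time step. The function names (`viterbi_beam`, `prune`, `to_log`), the `beam` parameter, and the two-state toy model are all assumptions made for this example.

```python
import math


def to_log(table):
    """Convert a nested dict of probabilities into log probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


def prune(hyps, beam):
    """Keep only hypotheses whose log score is within `beam` of the best one."""
    if not hyps:
        return hyps
    best = max(score for score, _ in hyps.values())
    return {s: h for s, h in hyps.items() if h[0] >= best - beam}


def viterbi_beam(observations, log_init, log_trans, log_emit, beam=10.0):
    """Return (best_path, best_log_score) using Viterbi search with beam pruning.

    Because of pruning, the result may be suboptimal when `beam` is small.
    """
    missing = -math.inf
    # `active` maps each surviving state to (log score, best path ending there).
    active = {}
    for state, log_p in log_init.items():
        score = log_p + log_emit[state].get(observations[0], missing)
        if score > missing:
            active[state] = (score, [state])
    active = prune(active, beam)

    for obs in observations[1:]:
        candidates = {}
        for prev, (prev_score, prev_path) in active.items():
            for nxt, log_p in log_trans.get(prev, {}).items():
                score = prev_score + log_p + log_emit[nxt].get(obs, missing)
                if score == missing:
                    continue
                # Dynamic programming step: keep the best predecessor per state.
                if nxt not in candidates or score > candidates[nxt][0]:
                    candidates[nxt] = (score, prev_path + [nxt])
        active = prune(candidates, beam)
        if not active:
            raise ValueError("all hypotheses were pruned or impossible")

    best_state = max(active, key=lambda s: active[s][0])
    best_score, best_path = active[best_state]
    return best_path, best_score


# Hypothetical two-state toy model, used only to exercise the function.
init = {"Healthy": 0.6, "Fever": 0.4}
trans = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}
log_init = {s: math.log(p) for s, p in init.items()}

path, score = viterbi_beam(
    ["normal", "cold", "dizzy"], log_init, to_log(trans), to_log(emit), beam=10.0
)
print(path, math.exp(score))
```

A narrower beam expands fewer hypotheses per step at the risk of discarding the path that would eventually have scored best, which is the trade-off between computational resources and accuracy that the abstract refers to. A large vocabulary decoder would apply the same principle over far larger structures such as lexical trees and word graphs, which this sketch does not attempt to model.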