Performance Evaluation of Deep Neural Networks Applied to Speech Recognition: RNN, LSTM and GRU
| Field | Value |
|---|---|
| Content Provider | Semantic Scholar |
| Author | Shewalkar, Apeksha Nagesh; Nyavanandi, Deepika; Ludwig, Simone A. |
| Copyright Year | 2019 |
| Abstract | Deep Neural Networks (DNNs) are neural networks with many hidden layers. DNNs are becoming popular in automatic speech recognition tasks, which combine a good acoustic model with a language model. Standard feedforward neural networks cannot handle speech data well since they have no way to feed information from a later layer back to an earlier layer. Thus, Recurrent Neural Networks (RNNs) were introduced to take temporal dependencies into account. However, the shortcoming of RNNs is that they cannot handle long-term dependencies due to the vanishing/exploding gradient problem. Therefore, Long Short-Term Memory (LSTM) networks were introduced; these are a special case of RNNs that take long-term dependencies in speech into account in addition to short-term dependencies. Similarly, Gated Recurrent Unit (GRU) networks are an improvement of LSTM networks that also take long-term dependencies into consideration. Thus, in this paper, we evaluate RNN, LSTM, and GRU to compare their performance on a reduced TED-LIUM speech data set. The results show that LSTM achieves the best word error rates; however, GRU optimization is faster while achieving word error rates close to those of LSTM. (A brief illustrative sketch of the three architectures follows this table.) |
| Starting Page | 235 |
| Ending Page | 245 |
| Page Count | 11 |
| File Format | PDF, HTM/HTML |
| DOI | 10.2478/jaiscr-2019-0006 |
| Volume Number | 9 |
| Alternate Webpage(s) | http://cs.ndsu.edu/~siludwig/Publish/papers/JAISCR20192.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
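The abstract's central point is that RNN, LSTM, and GRU layers are interchangeable encoders inside an acoustic model, differing in how they gate information across time. The following minimal sketch (not the authors' code) illustrates this drop-in comparison using PyTorch; the 40-dimensional feature input, layer sizes, and output class count are illustrative assumptions, not values from the paper.

```python
# Minimal sketch comparing the three recurrent architectures evaluated
# in the paper. All sizes below are hypothetical, chosen for illustration.
import torch
import torch.nn as nn

RECURRENT_LAYERS = {"rnn": nn.RNN, "lstm": nn.LSTM, "gru": nn.GRU}

class AcousticModel(nn.Module):
    def __init__(self, cell: str, n_features: int = 40,
                 hidden_size: int = 256, n_classes: int = 29):
        super().__init__()
        # nn.RNN, nn.LSTM, and nn.GRU share the same constructor
        # signature, so the cell type is a drop-in choice.
        self.encoder = RECURRENT_LAYERS[cell](
            input_size=n_features, hidden_size=hidden_size,
            num_layers=2, batch_first=True)
        self.classifier = nn.Linear(hidden_size, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_features) acoustic frames.
        outputs, _ = self.encoder(x)     # per-frame hidden states
        return self.classifier(outputs)  # per-frame class scores

for cell in RECURRENT_LAYERS:
    model = AcousticModel(cell)
    frames = torch.randn(8, 100, 40)     # dummy batch, 100 frames each
    print(cell, model(frames).shape)     # -> torch.Size([8, 100, 29])
```

Under this framing, the paper's comparison amounts to training the same model three times with a different `cell` value: the plain RNN suffers from vanishing/exploding gradients on long utterances, while the gated LSTM and GRU cells preserve long-term dependencies, with GRU using fewer gates and therefore typically training faster.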