Creation of unseen triphones from diphones and monophones using a speech production approach.
| Content Provider | CiteSeerX |
|---|---|
| Author | Elenius, Kjell; Blomberg, Mats |
| Abstract | With limited training data, infrequent triphone models for speech recognition will not be observed in sufficient number. In this report, a speech production approach is used to predict the characteristics of unseen triphones by concatenating diphones and/or monophones in the parametric representation of a formant speech synthesiser. The parameter trajectories are estimated by interpolation between the endpoints of the original units. The spectral states of the created triphone are generated by the speech synthesiser. Evaluation of the proposed technique has been performed using spectral error measurements and recognition candidate rescoring of N-best lists. In both cases, the created triphones are shown to perform better than the shorter units from which they were constructed. 1. INTRODUCTION The triphone unit is the basic phone model in many current phonetic speech recognition systems. The reason for this is that triphones capture the coarticulation effect caused by the immediate pr... |
| File Format | |
| Access Restriction | Open |
| Subject Keyword | Parametric Representation; Triphone Unit; Spectral State; Speech Production Approach; Coarticulation Effect; Formant Speech Synthesiser; Spectral Error Measurement; Immediate Pr; Original Unit; Limited Training Data; N-best List; Parameter Trajectory; Speech Synthesiser; Monophones Using Speech Production Approach; Speech Recognition; Unseen Triphones; Recognition Candidate Rescoring; Infrequent Triphone Model; Basic Phone Model; Unseen Triphones; Diphones; Sufficient Number |
| Content Type | Text |
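
The abstract states that parameter trajectories are estimated by interpolation between the endpoints of the original units. The sketch below illustrates that one step under stated assumptions: the interpolation is taken to be linear (the abstract does not say which kind is used), the function name `interpolate_trajectory` is hypothetical, and the formant values are invented for illustration and are not taken from the report.

```python
def interpolate_trajectory(start, end, n_frames):
    """Linearly interpolate each synthesiser parameter from `start` to `end`.

    `start` and `end` are equal-length lists of parameter values (for example
    formant frequencies in Hz) at the endpoints of two adjacent units.
    Linear interpolation is an assumption made for this sketch.
    """
    if n_frames < 2:
        raise ValueError("n_frames must be at least 2")
    if len(start) != len(end):
        raise ValueError("start and end must have the same number of parameters")

    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # runs from 0.0 at the first frame to 1.0 at the last
        frames.append([s + t * (e - s) for s, e in zip(start, end)])
    return frames


# Hypothetical endpoint values (F1, F2, F3 in Hz), for illustration only.
left_unit_end = [300.0, 2200.0, 3000.0]
right_unit_start = [700.0, 1200.0, 2600.0]

trajectory = interpolate_trajectory(left_unit_end, right_unit_start, 5)
for frame in trajectory:
    print(frame)
```

In the approach the abstract describes, trajectories of this kind would then be passed to the formant synthesiser to generate the spectral states of the created triphone; that stage is not sketched here.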