Automatic rich annotation of large corpus of conversational transcribed speech : the chunking task of the EPAC project (2008)
| Content Provider | CiteSeerX |
|---|---|
| Author | Antoine, Jean-Yves; Mokrane, Abdenour; Friburger, Nathalie |
| Description | This paper describes the use of the CasSys platform to chunk conversational speech transcripts by means of cascades of Unitex transducers. Our system is part of the EPAC project of the French National Research Agency (ANR). The aim of this project is to develop robust methods for annotating audio/multimedia document collections that contain conversational speech sequences, such as TV or radio programs. The paper first presents the EPAC project and the adaptation of a former chunking system (Romus), which was developed in the restricted framework of dedicated spoken man-machine dialogue. It then describes the problems that arise from 1) spontaneous speech disfluencies and 2) errors from the previous stages of processing (automatic speech recognition and POS tagging). |
| File Format | |
| Language | English |
| Publisher Date | 2008-01-01 |
| Publisher Institution | Proceedings of the 6th European Conference on Language Resources and Evaluation (LREC), Marrakech, Morocco; Université François Rabelais |
| Access Restriction | Open |
| Subject Keyword | CasSys Platform; Automatic Speech Recognition; Dedicated Spoken Man-machine Dialogue; Robust Method; Unitex Transducer; Former Chunking System; Radio Program; Spontaneous Speech Disfluency; Large Corpus; Audio Multimedia Document Collection; EPAC Project; Conversational Speech Sequence; French National Agency; Previous Stage; Restricted Framework; Automatic Rich Annotation; Conversational Transcribed Speech; Conversational Speech Transcript |
| Content Type | Text |
| Resource Type | Article |