Corpus Linguistics for the Annotation Manager
| Content Provider | Hyper Articles en Ligne (HAL) |
|---|---|
| Author | Fort, Karën; Nazarenko, Adeline; Ris, Claire |
| Abstract | Hand-crafted annotated corpora are acknowledged as critical elements for the Human Language Technologies, but systems have to be trained on domain-specific data to achieve a high level of performance. This is the reason why numerous annotation campaigns are launched. The role of the annotation manager consists in designing the annotation protocol, sometimes selecting the source data, hiring the required number of annotators with the adequate competences, writing the annotation guidelines, controlling the annotation process and delivering the resulting annotated corpus with the expected quality. However, for a given task, the complexity of the annotation work seems to be highly dependent on the type of corpus to annotate. Since this affects both the cost and the quality of the annotation, it is an important issue to tackle for the annotation manager. This paper illustrates the role of corpus linguistics for the management of annotations through a specific annotation campaign. We show how the corpus characteristics affect all aspects of the annotation protocol: the design of the annotation guidelines, the selection of a sub-corpus for training, the duration of the annotators' training, the complexity of the annotation formalism, and the quality of the resulting annotation. |
| File Format | |
| Language | English |
| Publisher Date | 2011-11-30 |
| Access Restriction | Open |
| Subject Keyword | corpus linguistics; annotation; annotation management; Computer Science [cs]; Document and Text Processing |
| Content Type | Text |
| Resource Type | Article |