Proceedings of The Second Workshop on Annotation and Exploitation of Parallel Corpora associated with The 8th International Conference on Recent Advances in Natural Language Processing (RANLP
| Content Provider | Semantic Scholar |
|---|---|
| Author | Smedt, Koenraad De |
| Copyright Year | 2011 |
| Abstract | Recent developments in statistical machine translation (SMT), e.g., the availability of efficient implementations of integrated open-source toolkits like Moses, have made it possible to build a prototype system with decent translation quality for any language pair in a few days or even hours. This is so in theory. In practice, doing so requires having a large set of parallel sentence-aligned bilingual texts (a bi-text) for that language pair, which is often unavailable. Large high-quality bi-texts are rare; except for Arabic, Chinese, and some official languages of the European Union (EU), most of the 6,500+ world languages remain resource-poor from an SMT viewpoint. This number is even more striking if we consider language pairs instead of individual languages, e.g., while Arabic and Chinese are among the most resource-rich languages for SMT, the Arabic-Chinese language pair is quite resource-poor. Moreover, even resource-rich language pairs could be poor in bi-texts for a specific domain, e.g., biomedical text, conversa- |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://www.bultreebank.org/AEPC2/Proceedings-AEPC2011.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Proceeding |