On the applicability of readability models to web texts.
| Content Provider | CiteSeerX |
|---|---|
| Author | Vajjala, Sowmya; Meurers, Detmar |
| Description | This content is published in/by Universität Tübingen |
| Abstract | An increasing range of features is being used for automatic readability classification. The impact of the features typically is evaluated using reference corpora containing graded reading material. But how do the readability models and the features they are based on perform on real-world web texts? In this paper, we want to take a step towards understanding this aspect on the basis of a broad range of lexical and syntactic features and several web datasets we collected. Applying our models to web search results, we find that the average reading level of the retrieved web documents is relatively high. At the same time, documents at a wide range of reading levels are identified and even among the Top-10 search results one finds documents at the lower levels, supporting the potential usefulness of readability ranking for the web. Finally, we report on generalization experiments showing that the features we used generalize well across different web sources. |
| File Format | |
| Access Restriction | Open |
| Subject Keyword | Readability Model; Web Text; Broad Range; Generalization Experiment; Average Reading Level; Syntactic Feature; Top-10 Search Result; Readability Ranking; Real-world Web Text; Potential Usefulness; Different Web Source; Several Web Datasets; Search Result; Retrieved Web Document; Wide Range; Automatic Readability Classification; Reading Level; Reference Corpus |
| Content Type | Text |