Little words can make a big difference for text classification.
| Content Provider | CiteSeerX |
|---|---|
| Abstract | Most information retrieval systems use stopword lists and stemming algorithms. However, we have found that recognizing singular and plural nouns, verb forms, negation, and prepositions can produce dramatically different text classification results. We present results from text classification experiments that compare relevancy signatures, which use local linguistic context, with corresponding indexing terms that do not. In two different domains, relevancy signatures produced better results than the simple indexing terms. These experiments suggest that stopword lists and stemming algorithms may remove or conflate many words that could be used to create more effective indexing terms. |
| File Format | |
| Access Restriction | Open |
| Subject Keyword | Little Word, Different Domain, Plural Noun, Simple Indexing Term, Different Text Classification Result, Effective Indexing Term, Big Difference, Many Word, Local Linguistic Context, Text Classification, Information Retrieval System, Stopword List, Verb Form, Present Result, Text Classification Experiment, Relevancy Signature |
| Content Type | Text |
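
The abstract's point that stopword lists and stemming can conflate useful distinctions can be illustrated with a minimal sketch. This is not the authors' relevancy-signature method: the stopword list, the `crude_stem` suffix stripper, and the example sentences are all hypothetical stand-ins chosen only to show how negation, verb form, and plurality can disappear from the indexing terms.

```python
# Hypothetical toy stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "was", "were", "is", "by", "in", "of", "not", "no"}


def crude_stem(word):
    """Strip a few common suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text):
    """Lowercase, drop stopwords, stem, and return the set of indexing terms."""
    tokens = text.lower().split()
    return {crude_stem(t) for t in tokens if t not in STOPWORDS}


def raw_terms(text):
    """Keep every lowercased token, including the 'little words'."""
    return set(text.lower().split())


negated = "the mayor was not attacked"
plural = "the mayors attacked"

# After stopword removal and stemming, both sentences yield the same terms.
same_after_processing = index_terms(negated) == index_terms(plural)

# With the little words kept, the two sentences remain distinguishable.
same_when_raw = raw_terms(negated) == raw_terms(plural)
```

In this sketch the two sentences collapse to identical term sets once "was", "not", the plural "-s", and the "-ed" ending are removed, even though they describe opposite situations. That is the kind of lost local context the abstract argues can matter for classification.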