Employing EM and Pool-Based Active Learning for Text Classification (1998).
| Field | Value |
|---|---|
| Content Provider | CiteSeerX |
| Author | McCallum, Andrew; Nigam, Kamal |
| Abstract | This paper shows how a text classifier's need for labeled training data can be reduced by a combination of active learning and Expectation-Maximization (EM) on a pool of unlabeled data. Query-by-Committee is used to actively select documents for labeling; EM with a naive Bayes model then further improves classification accuracy by concurrently estimating probabilistic labels for the remaining unlabeled documents and using them to improve the model. We also present a metric for better measuring disagreement among committee members; it accounts for the strength of their disagreement and for the distribution of the documents. Experimental results show that our method of combining EM and active learning requires only half as many labeled training examples to achieve the same accuracy as either EM or active learning alone. Keywords: text classification, active learning, unsupervised learning, information retrieval. (A rough implementation sketch follows the table.) |
| File Format | |
| Publisher Date | 1998-01-01 |
| Access Restriction | Open |
| Subject Keyword | Select Document, Pool-based Active Learning, Probabilistic Label, Text Classification, Active Learning, Committee Member, Naive Bayes Model, Active Learning, Training Example, Unlabeled Data, Classification Accuracy, Expectation Maximization, Text Classification, Unlabeled Document, Many Setting, Experimental Result, Information Retrieval, Labeled Training Data, Measuring Disagreement, Text Classifier |
| Content Type | Text |
| Resource Type | Article |
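
The abstract describes two ingredients: EM with a multinomial naive Bayes model over a pool of labeled plus unlabeled documents, and a committee-disagreement score that reflects both how strongly committee members disagree and how documents are distributed. The Python sketch below is not the authors' code; it is a minimal illustration under assumed inputs (word-count matrices and a tiny synthetic corpus), and the function names `nb_fit`, `em_naive_bayes`, and `committee_disagreement` are hypothetical.

```python
# Minimal sketch (illustrative, not the paper's implementation) of:
#  (1) EM with multinomial naive Bayes over labeled + unlabeled word counts, and
#  (2) a KL-to-the-mean committee disagreement score with an optional density weight.
import numpy as np


def nb_fit(X, resp, alpha=1.0):
    """M-step: estimate log class priors and log word probabilities.

    X    : (n_docs, n_words) word-count matrix
    resp : (n_docs, n_classes) class responsibilities; one-hot rows for
           labeled documents, soft posteriors for unlabeled ones
    """
    class_counts = resp.sum(axis=0)                 # expected number of docs per class
    word_counts = resp.T @ X                        # expected word counts per class
    log_prior = np.log(class_counts / class_counts.sum())
    log_word = np.log(
        (word_counts + alpha)
        / (word_counts.sum(axis=1, keepdims=True) + alpha * X.shape[1])
    )
    return log_prior, log_word


def nb_posterior(X, log_prior, log_word):
    """E-step: P(class | document) under the multinomial naive Bayes model."""
    joint = X @ log_word.T + log_prior              # unnormalized log posteriors
    joint -= joint.max(axis=1, keepdims=True)       # numerical stability
    post = np.exp(joint)
    return post / post.sum(axis=1, keepdims=True)


def em_naive_bayes(X_lab, y_lab, X_unl, n_classes, n_iter=20):
    """Fit naive Bayes on labeled counts, then refine with EM over the unlabeled pool."""
    onehot = np.eye(n_classes)[y_lab]
    params = nb_fit(X_lab, onehot)                  # initialize from labeled data only
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(n_iter):
        resp_unl = nb_posterior(X_unl, *params)     # E-step: probabilistic labels
        resp_all = np.vstack([onehot, resp_unl])    # labeled rows keep their true labels
        params = nb_fit(X_all, resp_all)            # M-step over labeled + unlabeled
    return params


def committee_disagreement(posteriors, density=None, eps=1e-12):
    """Per-document disagreement: mean KL divergence of each member's class
    posterior from the committee mean, optionally scaled by a density weight.

    posteriors : (n_members, n_docs, n_classes)
    density    : (n_docs,) or None
    """
    mean = posteriors.mean(axis=0)
    kl = (posteriors * np.log((posteriors + eps) / (mean + eps))).sum(axis=2)
    score = kl.mean(axis=0)
    return score * density if density is not None else score


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_lab = rng.poisson(1.0, size=(6, 50))          # 6 labeled docs, 50-word vocabulary
    y_lab = np.array([0, 0, 0, 1, 1, 1])
    X_unl = rng.poisson(1.0, size=(40, 50))         # unlabeled pool
    log_prior, log_word = em_naive_bayes(X_lab, y_lab, X_unl, n_classes=2)
    print(nb_posterior(X_unl[:3], log_prior, log_word))

    # Toy two-model "committee" (labeled-only vs. EM-refined) just to exercise the scorer.
    members = np.stack([
        nb_posterior(X_unl, *nb_fit(X_lab, np.eye(2)[y_lab])),
        nb_posterior(X_unl, log_prior, log_word),
    ])
    print(committee_disagreement(members)[:5])
```

In this sketch the labeled rows keep their hard labels at every M-step while the unlabeled rows contribute their estimated posteriors, matching the abstract's idea of concurrently estimating probabilistic labels for the unlabeled pool and using them to improve the model. The optional `density` weight in the disagreement score is one plausible reading of "accounts for ... the distribution of the documents"; the paper's exact metric may differ.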