Improvements that don’t add up: Ad-hoc retrieval results since 1998
| Content Provider | CiteSeerX |
|---|---|
| Author | Armstrong, Timothy G.; Moffat, Alistair; Webber, William; Zobel, Justin |
| Description | The existence and use of standard test collections in information retrieval experimentation allows results to be compared between research groups and over time. Such comparisons, however, are rarely made. Most researchers only report results from their own experiments, a practice that allows lack of overall improvement to go unnoticed. In this paper, we analyze results achieved on the TREC Ad-Hoc, Web, Terabyte, and Robust collections as reported in SIGIR (1998–2008) and CIKM (2004–2008). Dozens of individual published experiments report effectiveness improvements, and often claim statistical significance. However, there is little evidence of improvement in ad-hoc retrieval technology over the past decade. Baselines are generally weak, often being below the median original TREC system. And in only a handful of experiments is the score of the best TREC automatic run exceeded. Given this finding, we question the value of achieving even a statistically significant result over a weak baseline. We propose that the community adopt a practice of regular longitudinal comparison to ensure measurable progress, or at least prevent the lack of it from going unnoticed. We describe an online database of retrieval runs that facilitates such a practice. Proc. CIKM |
| Language | English |
| Publisher Date | 1998-01-01 |
| Access Restriction | Open |
| Subject Keyword | Claim Statistical Significance, Significant Result, Ad-hoc Retrieval Technology, Overall Improvement, Weak Baseline, Standard Test Collection, Median Original TREC System, Regular Longitudinal Comparison, Experiment Report, Effectiveness Improvement, TREC Ad-hoc, Little Evidence, Retrieval Run, Past Decade, Ad-hoc Retrieval Result, Robust Collection, Measurable Progress, TREC Automatic Run, Research Group, Online Database, Information Retrieval Experimentation |
| Content Type | Text |
| Resource Type | Article |
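The paper's central check is simple: is a reported baseline below the median score of the original TREC systems for that collection? A minimal sketch of that test, using illustrative (not real) MAP scores and a hypothetical helper name:

```python
from statistics import median

# Hypothetical MAP scores for the original TREC runs on one ad-hoc
# collection (illustrative numbers, not actual TREC data).
trec_run_scores = [0.18, 0.21, 0.24, 0.26, 0.29, 0.31, 0.33]

def baseline_is_weak(baseline_map, original_scores):
    """A baseline is 'weak' in the paper's sense when it falls below
    the median score of the original TREC systems."""
    return baseline_map < median(original_scores)

# A reported baseline of 0.22 falls below the median (0.26) of the
# illustrative runs, so it would be flagged as weak.
print(baseline_is_weak(0.22, trec_run_scores))  # True
print(baseline_is_weak(0.30, trec_run_scores))  # False
```

The same comparison, applied run-by-run over time, is what the authors' proposed online database of retrieval runs would make routine.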