Towards Scalable Data-Driven Authorship Attribution
| Content Provider | Semantic Scholar |
|---|---|
| Author | Cartright, Marc-Allen; Bendersky, Michael |
| Copyright Year | 2008 |
| Abstract | Traditional authorship attribution approaches have made attempts at capturing features that were designed heuristically – researchers guessed at which aspects of language would best separate one author from another and then performed experiments to see how valid their assumptions were. While this approach has met some success, it also proves to be unscalable – most test collections to date have been on the order of 10 or fewer authors, which in the age of internet-style publication is an unrealistically low quantity. We believe that this approach to feature selection for authorship attribution adds unnecessary complexity to what the task really seems to be: a multiclass classification problem, and one where the most useful features can be easily discovered using a standard dimensionality reduction technique. We demonstrate the use of such a technique to dramatically reduce the number of used features for authorship attribution using an implementation of Support Vector Machines. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://ciir-publications.cs.umass.edu/pub/web/getpdf.php?id=829 |
| Alternate Webpage(s) | http://ciir.cs.umass.edu/~bemike/pubs/2008-4.pdf |
| Alternate Webpage(s) | http://maroo.cs.umass.edu/pub/web/getpdf.php?id=829 |
| Alternate Webpage(s) | http://bendersky.github.io/pubs/2008-4.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |