Estimating the Number of Unseen Species: How Many Words did Shakespeare Know?
| Content Provider | Semantic Scholar |
|---|---|
| Author | McCullagh, Peter |
| Copyright Year | 2008 |
| Abstract | This paper is the first of two written by Brad Efron and Ron Thisted studying the frequency distribution of words in the Shakespearean canon. The key idea, due to Fisher in the context of sampling of species, is simple and elegant. When applied to Shakespeare, the idea appears to be preposterous: an author has a personal vocabulary of word species represented by a distribution G, and text is generated by sampling from this distribution. Most results do not require successive words to be sampled independently, which leaves room for individual style and context, but stationarity is needed for prediction and inference. The expected number of words that occur x ≥ 1 times in a large sample of n words is |
| Starting Page | 104 |
| Ending Page | 118 |
| Page Count | 15 |
| File Format | PDF, HTM / HTML |
| DOI | 10.1007/978-0-387-75692-9_5 |
| Alternate Webpage(s) | http://www.stat.uchicago.edu/~pmcc/reports/efron.pdf |
| Alternate Webpage(s) | https://doi.org/10.1007/978-0-387-75692-9_5 |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |