Wikipedia Mining: Wikipedia as a Corpus for Knowledge Extraction
| Content Provider | Semantic Scholar |
|---|---|
| Author | Nakayama, Kotaro; Pei, Minghua; Erdmann, Maike; Ito, Masahiro; Shirakawa, Masumi; Hara, Takahiro; Nishio, Shojiro |
| Copyright Year | 2008 |
| Abstract | Wikipedia, a collaborative Wiki-based encyclopedia, has become a huge phenomenon among Internet users. It covers a vast number of concepts in fields such as Arts, Geography, History, Science, Sports and Games. As a corpus for knowledge extraction, Wikipedia’s impressive characteristics are not limited to its scale; they also include a dense link structure, URL-based word sense disambiguation, and brief anchor texts. Because of these characteristics, Wikipedia has become a promising corpus and a major frontier for researchers. A considerable number of studies on Wikipedia mining, such as semantic relatedness measurement, bilingual dictionary construction, and ontology construction, have been conducted. In this paper, we take a comprehensive, panoramic view of Wikipedia as a Web corpus, since almost all previous studies exploit only some of Wikipedia’s characteristics. The contribution of this paper is threefold. First, we describe in detail the characteristics of Wikipedia as a corpus for knowledge extraction. In particular, we emphasize the importance of anchor texts, since they are helpful for both disambiguation and synonym extraction. Second, we introduce some of our own Wikipedia mining research, as well as research conducted by other researchers, in order to demonstrate the value of Wikipedia. Finally, we discuss possible directions for Wikipedia research. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://wikipedia-lab.org/en/images/0/06/Wikimania2008.pdf |
| Alternate Webpage(s) | http://ymatsuo.com/papers/wikimania08nakayama.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |