On Using Statistical Semantic on Domain Specific Information Retrieval
| Content Provider | Semantic Scholar |
|---|---|
| Author | Rekabsaz, Navid |
| Copyright Year | 2015 |
| Abstract | Information retrieval must move from a pure surface-based point-of-view to a conceptual point-of-view that matches the contents on a semantic level. Exploring the opportunities offered by statistical semantics, we revisit text-based retrieval in two very different domains (social image retrieval and patent retrieval) in order to provide a comparative analysis of the efficiency and effectiveness of the analyzed methods. Our semantic-based retrieval approach consists of two elements: first, the methods to create the semantic representations of the terms, and second, the approaches to measure the concept-based similarity of the texts. For term representations, we use Word2Vec, a state-of-the-art approach based on deep learning, as well as Random Indexing, a more straightforward but effective count-based method. Reviewing the literature, we also select two text similarity methods: one directly measuring similarity at document level (SimAgg) and the other considering the similarity of two documents as a linear combination of the relatedness of their terms (SimGreedy). We assess the performance and limitations of these methods by comparing them to state-of-the-art text search engines. In both domains, our semantic retrieval methods show a statistically significant improvement over a best-practice term-frequency-based search engine, at the expense of a significant increase in processing time. To address the time-complexity problem of semantic-based methods, we also focus on optimization to enable larger and more real-world style applications. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | https://publik.tuwien.ac.at/files/PubDat_245608.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
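
The two text similarity methods named in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function names `sim_agg` and `sim_greedy`, the uniform term weights, the use of cosine similarity as the relatedness measure, and the toy two-dimensional vectors are all assumptions made for illustration. The thesis may weight terms differently (for example by term importance) and uses Word2Vec or Random Indexing vectors rather than hand-written ones.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def sim_agg(doc_a, doc_b, vectors):
    """Document-level similarity: sum each document's term vectors, then compare.

    Assumption: terms are weighted uniformly; unknown terms are skipped.
    """
    dims = len(next(iter(vectors.values())))

    def aggregate(doc):
        total = [0.0] * dims
        for term in doc:
            if term in vectors:
                for i, x in enumerate(vectors[term]):
                    total[i] += x
        return total

    return cosine(aggregate(doc_a), aggregate(doc_b))


def sim_greedy(doc_a, doc_b, vectors):
    """Term-level similarity: each term is greedily paired with its most related
    term in the other document, and the pair scores are averaged.

    Assumption: uniform weights, and the two directions are averaged.
    """

    def directed(src, dst):
        src = [t for t in src if t in vectors]
        dst = [t for t in dst if t in vectors]
        if not src or not dst:
            return 0.0
        best = [max(cosine(vectors[s], vectors[d]) for d in dst) for s in src]
        return sum(best) / len(best)

    return 0.5 * (directed(doc_a, doc_b) + directed(doc_b, doc_a))


# Hypothetical toy vectors, hand-written for illustration only.
toy_vectors = {
    "cat": [1.0, 0.0],
    "kitten": [1.0, 0.0],
    "car": [0.0, 1.0],
}

print(sim_agg(["cat", "car"], ["kitten", "car"], toy_vectors))
print(sim_greedy(["cat", "car"], ["kitten"], toy_vectors))
```

The nested loop in `sim_greedy` compares every term of one document with every term of the other, which hints at why the abstract reports a significant increase in processing time for the semantic methods.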