A vector space model for information retrieval with generalized similarity measures.
| Content Provider | Semantic Scholar |
|---|---|
| Author | Paskaleva, Biliana S.; Bochev, Pavel B. |
| Copyright Year | 2012 |
| Abstract | We develop a new set of similarity functions for a formal vector space model for information retrieval. Our model considers records as multisets of tokens. A token weight maps records into real vectors. Using this vector representation we define a p-norm of a record and pairwise conjunction and disjunction operations on records. With the help of these operations we then develop consistent extensions of set-based similarity functions and new $\ell_p$ distance-based similarities. We show that with particular classes of token weights and p-values, our definitions recover the standard versions of the similarity functions. The paper concludes with a preliminary study of the new similarities in the context of a model entity matching problem. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | https://cfwebprod.sandia.gov/cfdocs/CompResearch/docs/CIKM-Paper.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
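
The abstract names the building blocks (records as multisets of tokens, a token weight, a p-norm, and pairwise conjunction and disjunction) but does not give their formulas. The Python sketch below is one plausible reading, not the authors' definitions: it assumes non-negative token weights, takes conjunction as the element-wise minimum and disjunction as the element-wise maximum of the weighted count vectors, and forms a Jaccard-style ratio of their p-norms. All function names (`weighted_vector`, `p_norm`, `conjunction`, `disjunction`, `generalized_jaccard`) are hypothetical.

```python
from collections import Counter


def weighted_vector(record, weight):
    """Map a record (a multiset of tokens) to a sparse real vector.

    Assumption: each component is token multiplicity times a non-negative
    token weight; the paper's actual weighting may differ.
    """
    counts = Counter(record)
    return {tok: weight(tok) * n for tok, n in counts.items()}


def p_norm(vec, p):
    """The p-norm of a sparse vector stored as a dict."""
    return sum(abs(v) ** p for v in vec.values()) ** (1.0 / p)


def conjunction(u, v):
    """Assumed conjunction: element-wise minimum over shared tokens."""
    return {t: min(u[t], v[t]) for t in u.keys() & v.keys()}


def disjunction(u, v):
    """Assumed disjunction: element-wise maximum over all tokens."""
    return {t: max(u.get(t, 0.0), v.get(t, 0.0)) for t in u.keys() | v.keys()}


def generalized_jaccard(r, s, weight, p=1):
    """Jaccard-style similarity: norm of conjunction over norm of disjunction."""
    u, v = weighted_vector(r, weight), weighted_vector(s, weight)
    denom = p_norm(disjunction(u, v), p)
    # Two empty records are treated as identical (a convention chosen here).
    return p_norm(conjunction(u, v), p) / denom if denom else 1.0


def unit(token):
    """Unit token weight: every token counts as 1."""
    return 1.0


# With unit weights, p = 1, and records without repeated tokens, the ratio
# is |intersection| / |union|, i.e. the standard Jaccard coefficient.
similarity = generalized_jaccard(["a", "b", "c"], ["b", "c", "d"], unit, p=1)
```

Under these assumptions the example records share two of four distinct tokens, so the ratio reduces to the ordinary set-based Jaccard value, which illustrates the kind of "recovery of the standard versions" the abstract mentions. Other choices of `weight` and `p` give the weighted and multiset variants.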