Effective and scalable solutions for mixed and split citation problems in digital libraries (2005)
| Content Provider | CiteSeerX |
|---|---|
| Author | Lee, Dongwon; On, Byung-Won; Kang, Jaewoo; Park, Sanghyun |
| Description | In this paper, we consider two important problems that commonly occur in bibliographic digital libraries and seriously degrade their data quality: the Mixed Citation (MC) problem (i.e., citations of different scholars whose names are homonyms are mixed together) and the Split Citation (SC) problem (i.e., citations of the same author appear under different name variants). In particular, we investigate an effective yet scalable solution, since citations in such digital libraries tend to be large-scale. After formally defining the problems and accompanying challenges, we present an effective solution that is based on the state-of-the-art sampling-based approximate join algorithm. Our claim is verified through preliminary experimental results. INFORMATION QUALITY IN INFORMATIONAL SYSTEMS |
| File Format | |
| Language | English |
| Publisher | ACM |
| Publisher Date | 2005-01-01 |
| Access Restriction | Open |
| Subject Keyword | Split Citation Problem; Mixed Citation; Different Scholar; Scalable Solution; Split Citation; Different Name Variant; Preliminary Experimental Result; State-of-the-art Sampling-based Approximate Join Algorithm; Important Problem; Bibliographic Digital Library; Data Quality; Effective Solution; Digital Library |
| Content Type | Text |
| Resource Type | Article |