NDLI: Approximate matching of textual domain attributes for information source integration

Please wait, while we are loading the content...

Content Provider	ACM Digital Library
Author	Koeller, Andreas Keelara, Vinay
Abstract	A key problem in the integration of information sources is the identification of related attributes or objects across independent sources. Inferring such meta-information from source data (rather than a-priori available meta-data, such as attribute names) is sometimes possible. For example, existing algorithms attempt to integrate information sources by finding patterns such as Inclusion Dependencies (INDs) across them. However, INDs are based on exact set inclusion and are thus very strict patterns that rarely hold across independent real-world databases.We propose two error-tolerant measures, termed Similarity Score and Distribution Score, that help identify related attributes across two independent databases, based on similarities in their data. Those measures specifically address the problem of identifying semantic relationships between textual attributes of databases that have few or no equal values.We also present implementations of those measures and some experimental results.
Starting Page	77
Ending Page	86
Page Count	10
File Format	PDF
ISBN	1595931600
DOI	10.1145/1077501.1077516
Language	English
Publisher	Association for Computing Machinery (ACM)
Publisher Date	2005-06-17
Publisher Place	New York
Access Restriction	Subscribed
Content Type	Text
Resource Type	Article

Sl.	Authority	Responsibilities	Communication Details
1	Ministry of Education (GoI), Department of Higher Education	Sanctioning Authority	https://www.education.gov.in/ict-initiatives
2	Indian Institute of Technology Kharagpur	Host Institute of the Project: The host institute of the project is responsible for providing infrastructure support and hosting the project	https://www.iitkgp.ac.in
3	National Digital Library of India Office, Indian Institute of Technology Kharagpur	The administrative and infrastructural headquarters of the project	Dr. B. Sutradhar bsutra@ndl.gov.in
4	Project PI / Joint PI	Principal Investigator and Joint Principal Investigators of the project	Dr. B. Sutradhar bsutra@ndl.gov.in Prof. Saswat Chakrabarti will be added soon
5	Website/Portal (Helpdesk)	Queries regarding NDLI and its services	support@ndl.gov.in
6	Contents and Copyright Issues	Queries related to content curation and copyright issues	content@ndl.gov.in
7	National Digital Library of India Club (NDLI Club)	Queries related to NDLI Club formation, support, user awareness program, seminar/symposium, collaboration, social media, promotion, and outreach	clubsupport@ndl.gov.in
8	Digital Preservation Centre (DPC)	Assistance with digitizing and archiving copyright-free printed books	dpc@ndl.gov.in
9	IDR Setup or Support	Queries related to establishment and support of Institutional Digital Repository (IDR) and IDR workshops	idr@ndl.gov.in

Approximate matching of textual domain attributes for information source integration