NDLI: Algorithmic detection of semantic similarity

Semantic similarity between search engine queries using temporal correlation

GlobeDB: autonomic data replication for web applications

Fully automatic wrapper generation for search engines

Ranking a stream of news

A service creation environment based on end to end composition of Web services

Building adaptable and reusable XML applications with model transformations

Learning domain ontologies for Web service descriptions: an experiment in bioinformatics

Shared lexicon for distributed annotations on the Web

Improving Web search efficiency via a locality based static pruning method

Sub-document queries over XML with XSQirrel

eBag: a ubiquitous Web infrastructure for nomadic learning

Topic segmentation of message hierarchies for indexing and navigation support

Accessibility: a Web engineering approach

CubeSVD: a novel approach to personalized Web search

An abuse-free fair contract signing protocol based on the RSA signature

A search engine for natural language applications

A convenient method for securely managing passwords

ATMEN: a triggered network measurement infrastructure

On optimal service selection

PageRank as a function of the damping factor

Information search and re-access strategies of experienced web users

Named graphs, provenance and trust

Scaling link-based similarity search

Incremental maintenance for materialized XPath/XSLT views

CaTTS: calendar types and constraints for Web applications

A multi-threaded PIPELINED Web server architecture for SMP/SoC machines

WWW at 15 years: looking forward

Duplicate detection in click streams

Hierarchical substring caching for efficient content distribution to low-bandwidth clients

Web data extraction based on partial tree alignment

Algorithmic detection of semantic similarity

Ensuring required failure atomicity of composite Web services

Exception handling in workflow-driven Web applications

Making RDF presentable: integrated global and local semantic Web browsing

Using XForms to simplify Web programming

Sampling search-engine results

XJ: facilitating XML processing in Java

Online curriculum on the semantic Web: the CSD-UoC portal for peer-to-peer e-learning

Gimme' the context: context-driven automatic semantic annotation with C-PANKOW

A multilingual usage consultation tool based on internet searching: more than a search engine, less than QA

Automatic identification of user goals in Web search

TrustGuard: countering vulnerabilities in reputation management for decentralized overlay networks

An enhanced model for searching in semantic portals

Improving understanding of website privacy policies with fine-grained policy anchors

On the lack of typical behavior in the global Web traffic network

G-ToPSS: fast filtering of graph-based metadata

Object-level ranking: bringing order to Web objects

Browsing fatigue in handhelds: semantic bookmarking spells relief

OWL DL vs. OWL flight: conceptual modeling and reasoning for the semantic Web

LSH forest: self-tuning indexes for similarity search

Compiling XSLT 2.0 into XQuery 1.0

Expressiveness of XSDs: from practice to theory, there and back again

Cataclysm: policing extreme overloads in internet applications

The case for technology for developing regions

Improving recommendation lists through topic diversification

Executing incoherency bounded continuous queries at web data aggregators

Thresher: automating the unwrapping of semantic content from the World Wide Web

SemRank: ranking complex relationship search results on the semantic web

Web service interfaces

AwareDAV: a generic WebDAV notification framework and implementation

Web-assisted annotation, semantic indexing and search of television and radio news

Three-level caching for efficient query processing in large Web search engines

XQuery containment in presence of variable binding dependencies

The classroom sentinel: supporting data-driven decision-making in the classroom

Opinion observer: analyzing and comparing opinions on the Web

Improving portlet interoperability through deep annotation

User-centric Web crawling

Static approximation of dynamically generated Web pages

Disambiguating Web appearances of people in a social network

Hardening Web browsers against man-in-the-middle and eavesdropping attacks

Analysis of multimedia workloads with implications for internet streaming

Automating metadata generation: the simple indexing interface

A uniform approach to accelerated PageRank computation

WebPod: persistent Web browsing sessions with pocketable storage devices

Debugging OWL ontologies

Partitioning of Web graphs by community topology

An adaptive, fast, and safe XML parser based on byte sequences memorization

WEESA: Web engineering for semantic Web applications

Design for verification for asynchronously communicating Web services

Innovation for a human-centered network: NTT's R&D activities for achieving the NTT group's medium-term management strategy

Towards usable Web privacy and security

Real and the future of digital media

Algorithmic detection of semantic similarity

Content Provider	ACM Digital Library
Author	Roinestad, Heather; Vespignani, Alessandro; Menczer, Filippo; Maguitman, Ana G.
Abstract	Automatic extraction of semantic information from text and links in Web pages is key to improving the quality of search results. However, the assessment of automatic semantic measures is limited by the coverage of user studies, which do not scale with the size, heterogeneity, and growth of the Web. Here we propose to leverage human-generated metadata, namely topical directories, to measure semantic relationships among massive numbers of pairs of Web pages or topics. The Open Directory Project classifies millions of URLs in a topical ontology, providing a rich source from which semantic relationships between Web pages can be derived. While semantic similarity measures based on taxonomies (trees) are well studied, the design of well-founded similarity measures for objects stored in the nodes of arbitrary ontologies (graphs) is an open problem. This paper defines an information-theoretic measure of semantic similarity that exploits both the hierarchical and non-hierarchical structure of an ontology. An experimental study shows that this measure improves significantly on the traditional taxonomy-based approach. This novel measure allows us to address the general question of how text and link analyses can be combined to derive measures of relevance that are in good agreement with semantic similarity. Surprisingly, the traditional use of text similarity turns out to be ineffective for relevance ranking.
Starting Page	107
Ending Page	116
Page Count	10
File Format	PDF
ISBN	1595930469
DOI	10.1145/1060745.1060765
Language	English
Publisher	Association for Computing Machinery (ACM)
Publisher Date	2005-05-10
Publisher Place	New York
Access Restriction	Subscribed
Subject Keyword	Content and link similarity; Web mining; Semantic similarity; Ranking evaluation; Web search
Content Type	Text
Resource Type	Article
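
Illustrative note: the abstract contrasts its graph-based measure with a "traditional taxonomy-based approach" but this record does not reproduce either formula. As a rough sketch only, the code below assumes that the tree-based baseline is Lin's information-theoretic similarity, sim(t1, t2) = 2 log P(t0) / (log P(t1) + log P(t2)), where t0 is the lowest common ancestor of the two topics and P(t) is the fraction of pages filed under t or its descendants. The toy taxonomy, the page counts, and every function name are hypothetical, and the sketch does not implement the paper's measure for arbitrary ontologies.

```python
import math

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "Top": None,
    "Science": "Top",
    "Arts": "Top",
    "Physics": "Science",
    "Biology": "Science",
}

# Hypothetical counts of pages filed directly under each topic.
DIRECT_COUNT = {"Top": 0, "Science": 2, "Arts": 4, "Physics": 1, "Biology": 1}


def ancestors(topic):
    """Return the path from a topic up to the root, starting with the topic."""
    path = []
    while topic is not None:
        path.append(topic)
        topic = PARENT[topic]
    return path


def subtree_probability(topic):
    """Fraction of all pages filed under a topic or any of its descendants."""
    total = sum(DIRECT_COUNT.values())
    in_subtree = sum(
        count for t, count in DIRECT_COUNT.items() if topic in ancestors(t)
    )
    return in_subtree / total


def lowest_common_ancestor(t1, t2):
    """Deepest topic that lies on both root paths."""
    above_t2 = set(ancestors(t2))
    return next(a for a in ancestors(t1) if a in above_t2)


def tree_similarity(t1, t2):
    """Lin-style similarity on a tree (assumed baseline, see note above).

    Assumes every topic has a non-empty subtree, so no log(0) occurs.
    """
    if t1 == t2:
        return 1.0
    lca = lowest_common_ancestor(t1, t2)
    numerator = 2 * math.log(subtree_probability(lca))
    denominator = math.log(subtree_probability(t1)) + math.log(
        subtree_probability(t2)
    )
    return numerator / denominator


if __name__ == "__main__":
    for pair in [("Physics", "Biology"), ("Physics", "Science"), ("Physics", "Arts")]:
        print(pair, round(tree_similarity(*pair), 3))
```

In this toy data, sibling topics under "Science" score higher than topics whose only shared ancestor is the root, which is the behaviour a tree-based measure is meant to capture. Cross-links between branches, which a tree cannot represent, are the non-hierarchical structure the abstract says the paper's measure additionally exploits.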
