NDLI: Reducing Query Latencies in Web Search Using Fine-Grained Parallelism

Content Provider	Springer Nature Link
Author	Frachtenberg, Eitan
Copyright Year	2009
Abstract	Semantic Web search is a new application of recent advances in information retrieval (IR), natural language processing, artificial intelligence, and other fields. The Powerset group in Microsoft develops a semantic search engine that aims to answer queries not only by matching keywords, but by actually matching meaning in queries to meaning in Web documents. Compared to typical keyword search, semantic search can pose additional engineering challenges for the back-end and infrastructure designs. Of these, the main challenge addressed in this paper is how to lower query latencies to acceptable, interactive levels. Index-based semantic search requires more data processing, such as numerous synonyms, hypernyms, multiple linguistic readings, and other semantic information, both on queries and in the index. In addition, some of the algorithms can be super-linear, such as matching co-references across a document. Consequently, many semantic queries can run significantly slower than the same keyword query. Users, however, have grown to expect Web search engines to provide near-instantaneous results, and a slow search engine could be deemed unusable even if it provides highly relevant results. It is therefore imperative for any search engine to meet its users’ interactivity expectations, or risk losing them. Our approach to tackle this challenge is to exploit data parallelism in slow search queries to reduce their latency in multi-core systems. Although all search engines are designed to exploit parallelism, at the single-node level this usually translates to throughput-oriented task parallelism. This paper focuses on the engineering of two latency-oriented approaches (coarse- and fine-grained) and compares them to the task-parallel approach. We use Powerset’s deployed search engine to evaluate the various factors that affect parallel performance: workload, overhead, load balancing, and resource contention. We also discuss heuristics to selectively control the degree of parallelism and consequent overhead on a query-by-query level. Our experimental results show that using fine-grained parallelism with these dynamic heuristics can significantly reduce query latencies compared to fixed, coarse-granularity parallelization schemes. Although these results were obtained on, and optimized for, Powerset’s semantic search, they can be readily generalized to a wide class of inverted-index search engines.
Starting Page	441
Ending Page	460
Page Count	20
File Format	PDF
ISSN	1386-145X
Journal	World Wide Web
Volume Number	12
Issue Number	4
e-ISSN	1573-1413
Language	English
Publisher	Springer US
Publisher Date	2009-07-17
Publisher Place	Boston
Access Restriction	One Nation One Subscription (ONOS)
Subject Keyword	semantic web search engines; performance evaluation; multi-core processors; parallel algorithms; Operating Systems; Database Management; Information Systems Applications (incl. Internet)
Content Type	Text
Resource Type	Article
Subject	Computer Networks and Communications; Software; Hardware and Architecture

Similar Documents

Reducing query latencies in web search using fine-grained parallelism (2009).

Three-Level Caching for Efficient Query Processing in Large Web Search Engines

Query-Free News Search

Ws-AC: A Fine Grained Access Control System for Web Services

Algorithmic Computation and Approximation of Semantic Similarity

Finding Related Search Engine Queries by Web Community Based Query Enrichment

Semantic flooding : Semantic search across distributed lightweight ontologies

DIASPORA: A highly distributed web-query processing system

Exploiting fine-grained parallelism in graph traversal algorithms via lock virtualization on multi-core architecture

Reducing Query Latencies in Web Search Using Fine-Grained Parallelism