NDLI: A generalized hidden Markov model with discriminative training for query spelling correction

Salton award lecture: information retrieval as engineering science

Adaptation of the concept hierarchy model with search logs for query recommendation on intranets

Privacy-aware image classification and search

Diversity by proportionality: an election-based approach to search result diversification

Time-based calibration of effectiveness measures

Combining inverted indices and structured search for ad-hoc object retrieval

TFMAP: optimizing MAP for top-n context-aware recommendation

Modeling the impact of short- and long-term behavior on search personalization

Efficient in-memory top-k document retrieval

Studies of the onset and persistence of medical concerns in search logs

Mining query subtopics from search log data

Efficient query recommendations in the long tail via center-piece subgraphs

Detecting quilted web pages at scale

Explanatory semantic relatedness and explicit spatialization for exploratory search

Image ranking based on user browsing behavior

Predicting the ratings of multimedia items for making personalized recommendations

Automatic refinement of patent queries using concept importance predictors

Modeling user posting behavior on social media

Automatic suggestion of query-rewrite rules for enterprise search

Learning to predict response times for online query scheduling

Learning to rank social update streams

See-to-retrieve: efficient processing of spatio-visual keyword queries

Mining the web for points of interest

Structural relationships for large-scale learning of answer re-ranking

Dual role model for question recommendation in community question answering

Content-based retrieval for heterogeneous domains: domain adaptation by relative aggregation points

Personalized diversification of search results

Quality through flow and immersion: gamifying crowdsourced relevance assessments

Improving retrieval of short texts through document expansion

Confidence-aware graph regularization with heterogeneous pairwise features

A knowledge-based approach for summarising opinions

A framework for manipulating and searching multiple retrieval types

IR paradigms in computational advertising

A hybrid model for ad-hoc information retrieval

Beyond bag-of-words: machine learning for query-document matching in web search

Retrieving information from the book of humanity: the personalized medicine data tsunami crashes on the beach of jeopardy

Adaptive query suggestion for difficult queries

Manhattan hashing for large-scale image retrieval

Explicit relevance models in intent-oriented information retrieval diversification

Time drives interaction: simulating sessions in diverse searching environments

Retrieving similar discussion forum threads: a structure based approach

Increasing temporal diversity with purchase intervals

Improving searcher models using mouse cursor activity

Index maintenance for time-travel text search

A semi-supervised approach to modeling web search satisfaction

Search, interrupted: understanding and predicting search task continuation

Supporting efficient top-k queries in type-ahead search

Fighting against web spam: a novel propagation method based on click-through data

A subjunctive exploratory search interface to support media studies researchers

Modeling concept dynamics for large scale music search

Personalized click shaping through lagrangian duality for online recommendation

Automatic term mismatch diagnosis for selective query expansion

Friend or frenemy?: predicting signed ties in social networks

Time-sensitive query auto-completion

Prefetching query results and its impact on search engines

Collaborative personalized tweet recommendation

Placing images on the world map: a microblog-based enrichment approach

TwiNER: named entity recognition in targeted twitter stream

Top-k learning to rank: labeling, ranking and evaluation

Vote calibration in community question-answering systems

Mixture model with multiple centralized retrieval algorithms for result merging in federated search

Combining implicit and explicit topic representations for result diversification

An IR-based evaluation framework for web search query segmentation

Extending BM25 with multiple query operators

A utility-theoretic ranking method for semi-automated text classification

Adaptive IR for exploratory search support

A visual tool for bayesian data analysis: the impact of smoothing on naive bayes text classifiers

Watson: the Jeopardy! challenge and beyond

Exploiting paths for entity search in RDF graphs

Methods for mining and summarizing text conversations

Learning to suggest: a machine learning framework for ranking query suggestions

Boosting multi-kernel locality-sensitive hashing for scalable image retrieval

AspecTiles: tile-based visualization of diversified web search results

Evaluating aggregated search pages

Summarizing highly structured documents for effective search interaction

Adaptive diversification of recommendation results via latent factor portfolio

Personalization of search results using interaction behaviors in search sessions

Optimizing positional index structures for versioned document collections

Social annotations: utility and prediction modeling

Multi-aspect query summarization by composite query

SimFusion+: extending simfusion towards efficient estimation on large and dynamic networks

Learning hash codes for efficient content reuse detection

Task complexity, vertical display and user interaction in aggregated search

Finding translations in scanned book collections

What reviews are satisfactory: novel features for automatic helpfulness voting

Generating reformulation trees for complex queries

Social-network analysis using topic models

A generalized hidden Markov model with discriminative training for query spelling correction

Online result cache invalidation for real-time web search

Exploring social influence for recommendation: a generative model approach

Where is who: large-scale photo retrieval by facial attributes and canvas layout

Adaptive context features for toponym resolution in streaming news

Robust ranking models via risk-sensitive optimization

Category hierarchy maintenance: a data-driven approach

Reactive index replication for distributed search engines

Using preference judgments for novel document retrieval

On per-topic variance in IR evaluation

Rhetorical relations for information retrieval

Improving tweet stream classification by detecting changes in word probability

Adversarial content manipulation effects

ALF: a client side logger and server for capturing user interactions in web applications

Putting context into search and search into context

A study of term weighting schemes using class information for text classification

Crowdsourcing for search evaluation and social-algorithmic search

User evaluation of query quality

To index or not to index: time-space trade-offs in search engines with positional ranking functions

An exploration of ranking heuristics in mobile local search

Language intent models for inferring user browsing behavior

Group matrix factorization for scalable topic modeling

Proximity-based rocchio's model for pseudo relevance

Cognos: crowdsourcing search for topic experts in microblogs

When web search fails, searchers become askers: understanding the transition

An uncertainty-aware query selection model for evaluation of IR systems

Modeling higher-order term dependencies in information retrieval using query hypergraphs

Predicting quality flaws in user-generated content: the case of wikipedia

Building reputation and trust using federated search and opinion mining

ChatNoir: a search engine for the ClueWeb09 corpus

CloudSearch and the democratization of information retrieval

A topic model of clinical reports

(Big) usage data in web search

Enhancing knowledge base with knowledge transfer

CrowdTerrier: automatic crowdsourced relevance assessments with terrier

Entity sentiment extraction using text ranking

Active query selection for learning rankers

A new look at old tricks: the fertile roots of current research

Improving e-discovery using information retrieval

Distilling and exploring nuggets from a corpus

Anticipatory search: using context to initiate search

Aspect-based opinion mining from product reviews

Opinion influence and diffusion in social network

Integrative online research-data management

BReK12: a book recommender for K-12 users

Experimental methods for information retrieval

Relevance as a subjective and situational multidimensional concept

MaSe: create your own mash-up search interface

Clarity re-visited

IR models: foundations and relationships

Exploiting temporal topic models in social media retrieval

myDJ: recommending karaoke songs from one's own voice

Cluster-based one-class ensemble for classification problems in information retrieval

Patent information retrieval: an instance of domain-specific search

The essence of time: considering temporal relevance as an intent-aware ranking problem

PageFetch: a retrieval game for children (and adults)

Collaborative filtering with short term preferences mining

Medical information retrieval: an instance of domain-specific search

Pictune: situational music recommendation from geotagged pictures

Creating temporally dynamic web search snippets

Visual information retrieval using Java and LIRE

Political search trends

Dependency trigram model for social relation extraction from news articles

Large-scale graph mining and learning for information retrieval

RDF Xpress: a flexible expressive RDF search engine

Detecting candidate named entities in search queries

Query performance prediction for IR

Sketch-based image similarity search with a pen and paper interface

Effect of dynamic pruning safety on learning to rank effectiveness

Collaborative information seeking: art and science of achieving 1+1>2 in IR

Task-aware search assistant

Effect of written instructions on assessor agreement

Advances on the development of evaluation measures

TweetSpector: entity-based retrieval of tweets

Effects of expertise differences in synchronous social Q&A

YooSee: a video browsing application for young children

Efficient estimation of aspect weights

Multi-platform image search using tag enrichment

Emotion tagging for comments of online news by meta classification with heterogeneous information sources

Estimating the magic barrier of recommender systems: a user study

Explaining neighborhood-based recommendations

Exploiting term dependence while handling negation in medical search

Exploring example-based person search in email

Exploring tag relevance for image tag re-ranking

Fast on-line learning for multilingual categorization

Finding interesting posts in Twitter based on retweet graph analysis

Finding readings for scientists from social websites

Finding web appearances of social network users via latent factor model

Fixed versus dynamic co-occurrence windows in TextRank term weights for information retrieval

Gender-aware re-ranking

Genre classification for million song dataset using confidence-based classifiers combination

GLASE 0.1: eyes tell more than mice

How query extensions reflect search result abandonments

Identifying entity aspects in microblog posts

Impact of assessor disagreement on ranking performance

Incorporating statistical topic information in relevance feedback

Inferring missing relevance judgments from crowd workers via probabilistic matrix factorization

Investigating performance predictors using monte carlo simulation and score distribution models

Learning to select a time-aware retrieval model

Learning-based time-sensitive re-ranking for web search

Lightweight contrastive summarization for news comment mining

Looking inside the box: context-sensitive translation for cross-language information retrieval

Making results fit into 40 characters: a study in document rewriting

New assessment criteria for query suggestion

On automatically tagging web documents from examples

On building a reusable Twitter corpus

On judgments obtained from a commercial search engine

On the mathematical relationship between expected n-call@k and the relevance vs. diversity trade-off

On real-time ad-hoc retrieval evaluation

Opinion summarisation through sentence extraction: an investigation with movie reviews

Optimizing parameters of the expected reciprocal rank

Ousting ivory tower research: towards a web framework for providing experiments as a service

Parallelizing ListNet training using spark

Predicting lifespans of popular tweets in microblog

Preliminary study of technical terminology for the retrieval of scientific book metadata records

Queries without clicks: evaluating retrieval effectiveness based on user feedback

Retrieval evaluation on focused tasks

Rewarding term location information to enhance probabilistic information retrieval

Scheduling queries across replicas

Re-examining search result snippet examination time for relevance estimation

Sentiment identification by incorporating syntax, semantics and context information

Short text classification using very few words

Summarizing the differences from microblogs

Survival analysis of click logs

Text selections as implicit relevance feedback

Time to judge relevance as an indicator of assessor error

Towards alias detection without string similarity: an active learning based approach

Towards zero-click mobile IR evaluation: knowing what and knowing when

Twanchor text: a preliminary study of the value of tweets as anchor text

Unsupervised linear score normalization revisited

User-aware caching and prefetching query results in web search engines

Using eye-tracking with dynamic areas of interest for analyzing interactive information retrieval

Using PageRank to infer user preferences

Utilizing inter-document similarities in federated search

Want a coffee?: predicting users' trails

Will this #hashtag be popular tomorrow?

$100,000 prize jackpot. call now!: identifying the pertinent features of SMS spam

A generalized hidden Markov model with discriminative training for query spelling correction

Content Provider	ACM Digital Library
Author	Duan, Huizhong; Li, Yanen; Zhai, ChengXiang
Abstract	Query spelling correction is a crucial component of modern search engines. Existing methods in the literature for search query spelling correction have two major drawbacks. First, they are unable to handle certain important types of spelling errors, such as concatenation and splitting. Second, they cannot efficiently evaluate all the candidate corrections due to the complex form of their scoring functions, and a heuristic filtering step must be applied to select a working set of top-K most promising candidates for final scoring, leading to non-optimal predictions. In this paper we address both limitations and propose a novel generalized Hidden Markov Model with discriminative training that can not only handle all the major types of spelling errors, including splitting and concatenation errors, in a single unified framework, but also efficiently evaluate all the candidate corrections to ensure the finding of a globally optimal correction. Experiments on two query spelling correction datasets demonstrate that the proposed generalized HMM is effective for correcting multiple types of spelling errors. The results also show that it significantly outperforms the current approach for generating top-K candidate corrections, making it a better first-stage filter to enable any other complex spelling correction algorithm to have access to a better working set of candidate corrections as well as to cover splitting and concatenation errors, which no existing method in academic literature can correct.
Starting Page	611
Ending Page	620
Page Count	10
File Format	PDF
ISBN	9781450314725
DOI	10.1145/2348283.2348365
Language	English
Publisher	Association for Computing Machinery (ACM)
Publisher Date	2012-08-12
Publisher Place	New York
Access Restriction	Subscribed
Subject Keyword	Generalized hidden Markov models; Discriminative training for HMMs; Query spelling correction
Content Type	Text
Resource Type	Article
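
Illustrative sketch (not from the paper): the abstract describes scoring candidate corrections with a hidden Markov model and finding a globally optimal correction. The toy Python example below shows only the basic idea, under loud assumptions: hidden states are words from a tiny made-up vocabulary, emission scores are a fixed penalty per Levenshtein edit, transition scores come from invented bigram counts, and decoding uses standard Viterbi. It does not implement the authors' generalized HMM, their discriminative training, or splitting and concatenation handling; all names (`VOCAB`, `BIGRAMS`, `correct`) are hypothetical.

```python
import math

# Toy vocabulary and bigram counts: illustrative assumptions, not data from the paper.
VOCAB = ["new", "york", "times", "time", "now", "work"]
BIGRAMS = {("new", "york"): 50, ("york", "times"): 30, ("new", "work"): 2,
           ("now", "work"): 3, ("york", "time"): 1}


def edit_distance(a, b):
    """Plain Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def log_emit(observed, intended):
    """Log-score of typing `observed` when `intended` was meant.

    The penalty of 2.0 per edit is an arbitrary assumption.
    """
    return -2.0 * edit_distance(observed, intended)


def log_trans(prev_word, word):
    """Add-one smoothed log bigram score over the toy counts."""
    total = sum(c for (p, _), c in BIGRAMS.items() if p == prev_word)
    return math.log((BIGRAMS.get((prev_word, word), 0) + 1) / (total + len(VOCAB)))


def correct(query):
    """Viterbi decoding: hidden states are vocabulary words, observations are query tokens.

    Assumes a non-empty, whitespace-separated query.
    """
    tokens = query.split()
    # best[w] = (score of the best path ending in w, that path)
    best = {w: (log_emit(tokens[0], w), [w]) for w in VOCAB}
    for tok in tokens[1:]:
        nxt = {}
        for w in VOCAB:
            score, path = max(
                ((s + log_trans(p, w), pth) for p, (s, pth) in best.items()),
                key=lambda x: x[0])
            nxt[w] = (score + log_emit(tok, w), path + [w])
        best = nxt
    return " ".join(max(best.values(), key=lambda x: x[0])[1])


print(correct("nwe yrok tmes"))  # prints "new york times" with the toy tables above
```

Because every vocabulary word is considered at every position, this sketch scores all candidate sequences exactly rather than pruning to a top-K working set first, which is the property the abstract emphasizes; a realistic system would need a far larger vocabulary and learned parameters.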