NDLI: TM-LDA: efficient online modeling of latent topic transitions in social media

Panel on Mining the Big Data

Modeling Content and Users: Structured Probabilistic Representation and Scalable Inference Algorithms

Nine real hard problems we'd like you to solve

Rise and fall patterns of information diffusion: model and implications

Finding minimum representative pattern sets

The contextual focused topic model

A simple methodology for soft cost-sensitive classification

Discovering regions of different functions in a city using human mobility and POIs

Interaction and collective intelligence in internet computing

Learning from crowds in the presence of schools of thought

Searching and mining trillions of time series subsequences under dynamic time warping

Accelerated singular value thresholding for matrix completion

A sparsity-inducing formulation for evolutionary co-clustering

A framework for summarizing and analyzing twitter feeds

China's national personal credit scoring system: a real-life intelligent knowledge application

Efficient and domain-invariant competitor mining

Mining event periodicity from incomplete observations

Optimal exact least squares rank minimization

A structural cluster kernel for learning on graphs

Scalable misbehavior detection in online video chat services

Semantic search and a new moore's law effect in knowledge engineering

Magnet community identification on social networks

Discovering lag intervals for temporal dependencies

Transparent user models for personalization

Parallel field ranking

Estimating conversion rate in display advertising from past performance data

Algorithms for mining uncertain graph data

Review spam detection via temporal pattern discovery

Integrating community matching and outlier detection for mining evolutionary community outliers

Robust multi-task feature learning

Maximum inner-product search using cone trees

Empowering authors to diagnose comprehension burden in textbooks

Ensembles and model delivery for tax compliance

Capacitated team formation problem on social networks

Differential identifiability

Web image prediction using multivariate point processes

Open domain event extraction from twitter

An integrated data mining approach to real-time clinical monitoring and deterioration warning

Experiences and lessons in developing industry-strength machine learning and data mining software

Joint optimization of bid and budget allocation in sponsored search

Streaming graph partitioning for large distributed graphs

Circle-based recommendation in online social networks

Two approaches to understanding when constraints help clustering

Design principles of massive, robust prediction systems

Social media data analysis for revealing collective behaviors

Large-scale learning of word relatedness with constraints

An enhanced relevance criterion for more concise supervised pattern discovery

Mining discriminative components with low-rank and sparsity constraints for face recognition

BC-PDM: data mining, social network analysis and text mining system based on cloud computing

Exact Primitives for Time Series Data Mining

Mining heterogeneous information networks: the next frontier

Efficient personalized pagerank with accuracy assurance

Mining emerging patterns by streaming feature selection

Practical collapsed variational bayes inference for hierarchical dirichlet process

Intelligible models for classification and regression

Constructing popular routes from uncertain trajectories

Building an engine for big data

Social sampling

Fast mining and forecasting of complex time-stamped events

Fast bregman divergence NMF using taylor expansion and coordinate descent

Detecting changes of clustering structures using normalized maximum likelihood coding

Entity-centric topic-oriented opinion summarization in twitter

Maximizing return and minimizing cost with the right decision management systems

Discriminative clustering for market segmentation

Towards heterogeneous temporal clinical event pattern discovery: a convolutional approach

Large-scale distributed non-negative sparse coding and sparse dictionary learning

Multi-label hypothesis reuse

Keyword-propagation-based information enriching and noise removal for web news videos

Key lessons learned building recommender systems for large-scale social networks

Vertex neighborhoods, low conductance cuts, and good seeds for local community methods

Testing the significance of spatio-temporal teleconnection patterns

Estimating entity importance via counting set covers

Semi-supervised learning with mixed knowledge information

Multimedia features for click prediction of new ads in display advertising

Experience with discovering knowledge by acquiring it

Selecting a characteristic set of reviews

Different slopes for different folks: mining for exceptional regression models with cook's distance

Unsupervised feature selection for linked social media data

A probabilistic model for multimodal hash function learning

Coupled behavior analysis for capturing coupling relationships in group-based market manipulations

Leveraging predictive modeling to reduce signal theft in a multi-service organization environment

Finding trendsetters in information networks

Anonymizing set-valued data by nonreciprocal recoding

Transductive multi-label ensemble classification for protein function prediction

Stratified k-means clustering over a deep web data source

Multi-source learning for joint analysis of incomplete multi-modality neuroimaging data

The untold story of the clones: content-agnostic factors that impact YouTube video popularity

RolX: structural role extraction & mining in large graphs

Incorporating heterogeneous information for personalized tag recommendation in social tagging systems

Chromatic correlation clustering

PatentMiner: topic-driven patent analysis and mining

Information processing in social networks

Latent association analysis of document pairs

Efficient frequent item counting in multi-core hardware

Fast algorithms for comprehensive n-point correlation estimates

Query-driven discovery of semantically similar substructures in heterogeneous networks

Efficient Algorithms for Detecting Genetic Interactions in Genome-Wide Association Study

Divide-and-conquer and statistical inference for big data

PageRank on an evolving graph

Linear space direct pattern sampling using coupling from the past

Overlapping decomposition for causal graphical modeling

NASA: achieving lower regrets and faster rates via adaptive stepsizes

GetJar mobile application recommendations with very sparse datasets

A new challenge of information processing under the 21st century

From user comments to on-line conversations

Mining recent temporal patterns for event detection in multivariate time series data

GigaTensor: scaling tensor analysis up by 100 times - algorithms and discoveries

Subspace correlation clustering: finding locally correlated dimensions in subspace projections of the data

Community discovery and profiling with social messages

Interacting viruses in networks: can both survive?

The long and the short of it: summarising event sequences with serial episodes

Learning binary codes for collaborative filtering

Rank-loss support instance machines for MIML instance annotation

Harnessing the wisdom of the crowds for accurate web page clipping

Overlapping community detection via bounded nonnegative matrix tri-factorization

SeqiBloc: mining multi-time spanning blockmodels in dynamic graphs

ComSoc: adaptive transfer of user behaviors over composite social network

Batch mode active sampling based on marginal probability distribution matching

Trustworthy online controlled experiments: five puzzling outcomes explained

Bayesian relational data analysis

Mining contentions from discussions and debates

A near-linear time approximation algorithm for angle-based outlier detection in high-dimensional data

Model mining for robust feature selection

On socio-spatial group query for location-based social networks

HySAD: a semi-supervised hybrid shilling attack detector for trustworthy product recommendation

Towards social user profiling: unified and discriminative influence model for inferring home locations

Adversarial support vector machine learning

Multi-domain active learning for text classification

Metro maps of science

RainMon: an integrated approach to mining bursty timeseries monitoring data

SHALE: an efficient algorithm for allocation of guaranteed display advertising

Fast algorithms for maximal clique enumeration with limited memory

Cross-domain collaboration recommendation

Locally-scaled spectral clustering using empty region graphs

Storytelling in entity networks to support intelligence analysts

Understanding users' satisfaction for search engine evaluation

LIEGE: link entities in web lists with knowledge base

On nested palindromes in clickstream data

On "one of the few" objects

DAGger: clustering correlated uncertain data (to predict asset failure in energy networks)

Experiments in social computation: (and the data they generate)

Information diffusion and external influence in networks

Mining top-K high utility itemsets

TM-LDA: efficient online modeling of latent topic transitions in social media

Learning in non-stationary environments with class imbalance

Differentially private transit data publication: a case study on the montreal transportation system

Developing data mining applications

eTrust: understanding trust evolution in an online world

A shapelet transform for time series classification

Active learning for online bayesian matrix factorization

Dependency clustering across measurement scales

Finding trending local topics in search queries for personalization of a recommendation system

Aggregating web offers to determine product prices

Efficient event pattern matching with match windows

Low rank modeling of signed networks

Inductive multi-task learning with multiple view data

Bootstrapped language identification for multi-site internet domains

DEMON: a local-first discovery method for overlapping communities

USpan: an efficient algorithm for mining high utility sequential patterns

Online learning to diversify from implicit feedback

SPF-GMKL: generalized multiple kernel learning with a million kernels

Position-normalized click prediction in search advertising

Cross-media knowledge discovery

Discovering value from community activity on focused question answering sites: a case study of stack overflow

Intrusion as (anti)social communication: characterization and detection

Feature grouping and selection over an undirected graph

Random forests for metric learning with implicit pairwise position dependence

Following the electrons: methods for power management in commercial buildings

Event-based social networks: linking the online and offline social worlds

Modeling disease progression via fused sparse group lasso

Active sampling for entity matching

SympGraph: a framework for mining clinical notes through symptom relation graphs

Factoring past exposure in display advertising targeting

Summarization-based mining bipartite graphs

RecMax: exploiting recommender systems for fun and profit

Active spectral clustering via iterative uncertainty reduction

A framework for robust discovery of entity synonyms

Similarity search in real world networks

Automatic taxonomy construction from keywords

UFIMT: an uncertain frequent itemset mining toolbox

The missing models: a data-driven approach for learning how networks grow

Sampling minimal frequent boolean (DNF) patterns

Multi-view clustering using mixture models in subspace projections

Linear support vector machines via dual cached loops

On the separability of structural classes of communities

Mining large-scale, sparse GPS traces for map inference: comparison of approaches

Playlist prediction via metric embedding

Efficient evaluation of large sequence kernels

Bid optimizing and inventory scoring in targeted online advertising

Online allocation of display ads with smooth delivery

Mining coherent subgraphs in multi-layer graphs with edge labels

Learning personal + social latent factor model for social recommendation

Integrating meta-path selection with user-guided object clustering in heterogeneous information networks

SmartDispatch: enabling efficient ticket dispatch in an IT service environment

Visual exploration of collaboration networks based on graph degeneracy

TourViz: interactive visualization of connection pathways in large graphs

D-INDEX: a web environment for analyzing dependences among scientific collaborators

Information propagation game: a tool to acquire human playing data for multiplayer influence maximization on social networks

MoodLens: an emoticon-based sentiment analysis system for chinese tweets

Intelligent advertising framework for digital signage

AssocExplorer: an association rule visualization system for exploratory data analysis

GeoSearch: georeferenced video retrieval system

Siren: an interactive tool for mining and visualizing geospatial redescriptions

Navigating information facets on twitter (NIF-T)

HeteRecom: a semantic-based recommendation system in heterogeneous networks

VOXSUP: a social engagement framework

A system for extracting top-K lists from the web

EventSearch: a system for event discovery and retrieval on multi-type historical data

EvaPlanner: an evacuation planner with social-based flocking kinetics

PubMed search and exploration with real-time semantic network construction

TM-LDA: efficient online modeling of latent topic transitions in social media

Content Provider	ACM Digital Library
Author	Agichtein, Eugene; Wang, Yu; Benzi, Michele
Abstract	Latent topic analysis has emerged as one of the most effective methods for classifying, clustering and retrieving textual data. However, existing models such as Latent Dirichlet Allocation (LDA) were developed for static corpora of relatively large documents. In contrast, much of the textual content on the web, and especially social media, is temporally sequenced, and comes in short fragments, including microblog posts on sites such as Twitter and Weibo, status updates on social networking sites such as Facebook and LinkedIn, or comments on content sharing sites such as YouTube. In this paper we propose a novel topic model, Temporal-LDA or TM-LDA, for efficiently mining text streams such as a sequence of posts from the same author, by modeling the topic transitions that naturally arise in these data. TM-LDA learns the transition parameters among topics by minimizing the prediction error on topic distribution in subsequent postings. After training, TM-LDA is thus able to accurately predict the expected topic distribution in future posts. To make these predictions more efficient for a realistic online setting, we develop an efficient updating algorithm to adjust the topic transition parameters, as new documents stream in. Our empirical results, over a corpus of over 30 million microblog posts, show that TM-LDA significantly outperforms state-of-the-art static LDA models for estimating the topic distribution of new documents over time. We also demonstrate that TM-LDA is able to highlight interesting variations of common topic transitions, such as the differences in the work-life rhythm of cities, and factors associated with area-specific problems and complaints.
Starting Page	123
Ending Page	131
Page Count	9
File Format	PDF; MP4
ISBN	9781450314626
DOI	10.1145/2339530.2339552
Language	English
Publisher	Association for Computing Machinery (ACM)
Publisher Date	2012-08-12
Publisher Place	New York
Access Restriction	Subscribed
Subject Keyword	Temporal language models; Mining social media data; Topic transition modeling
Content Type	Audio; Text
Resource Type	Article
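
The abstract says that TM-LDA learns topic transition parameters by minimizing the prediction error on the topic distribution of subsequent posts. The sketch below illustrates that idea under stated assumptions: the error is taken to be squared error, the transition parameters are a single matrix `T`, and each post is already summarized by a topic distribution (for example, from a previously trained LDA model). The function names and toy data are hypothetical and are not taken from the record; this is not the authors' implementation, and it does not cover the online updating algorithm the abstract mentions.

```python
# Illustrative sketch only: fit a topic transition matrix T so that A @ T ~ B,
# where row i of A is the topic distribution of a post and row i of B is the
# topic distribution of the same author's next post.
# Assumption: "prediction error" is squared error, solved via normal equations.

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    cols = transpose(Y)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]

def solve(M, R):
    """Solve M X = R by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(R[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("singular system; more varied training pairs are needed")
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def fit_transitions(A, B):
    """Return T minimizing the squared error of A T against B."""
    At = transpose(A)
    return solve(matmul(At, A), matmul(At, B))

def predict_next(theta, T):
    """Predict the next post's topic distribution, clipped and renormalized."""
    raw = matmul([theta], T)[0]
    clipped = [max(v, 0.0) for v in raw]
    total = sum(clipped)
    if total == 0.0:
        raise ValueError("prediction has no positive mass")
    return [v / total for v in clipped]

# Toy data with two topics (made up for illustration).
A = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]        # current posts
B = [[0.8, 0.2], [0.3, 0.7], [0.55, 0.45]]      # the posts that followed
T = fit_transitions(A, B)
print(T)
print(predict_next([1.0, 0.0], T))
```

Row `i` of the fitted `T` can be read as the expected topic mix of the next post given that the current post is entirely about topic `i`. The clipping and renormalization in `predict_next` is a design choice of this sketch, since an unconstrained least-squares fit does not guarantee a valid probability distribution.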
