NDLI: Extracting information networks from the blogosphere

Extracting information networks from the blogosphere

Content Provider	ACM Digital Library
Author	Merhav, Yuval; Mesquita, Filipe; Frieder, Ophir; Barbosa, Denilson; Yee, Wai Gen
Copyright Year	2012
Abstract	We study the problem of automatically extracting information networks formed by recognizable entities as well as relations among them from social media sites. Our approach consists of using state-of-the-art natural language processing tools to identify entities and extract sentences that relate such entities, followed by using text-clustering algorithms to identify the relations within the information network. We propose a new term-weighting scheme that significantly improves on the state-of-the-art in the task of relation extraction, both when used in conjunction with the standard $\textit{tf} \cdot \textit{idf}$ scheme and also when used as a pruning filter. We describe an effective method for identifying benchmarks for open information extraction that relies on a curated online database that is comparable to the hand-crafted evaluation datasets in the literature. From this benchmark, we derive a much larger dataset which mimics realistic conditions for the task of open information extraction. We report on extensive experiments on both datasets, which not only shed light on the accuracy levels achieved by state-of-the-art open information extraction tools, but also on how to tune such tools for better results.
Starting Page	1
Ending Page	33
Page Count	33
File Format	PDF
ISSN	15591131
e-ISSN	1559114X
DOI	10.1145/2344416.2344418
Volume Number	6
Issue Number	3
Journal	ACM Transactions on the Web (TWEB)
Language	English
Publisher	Association for Computing Machinery (ACM)
Publisher Date	2012-10-02
Publisher Place	New York
Access Restriction	One Nation One Subscription (ONOS)
Subject Keyword	Clustering; Domain frequency; Named entities; Open information extraction; Relation extraction
Content Type	Text
Resource Type	Article
Subject	Computer Networks and Communications
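
The abstract describes weighting the terms that connect two entities with tf · idf and then clustering entity pairs by the similarity of those terms. The sketch below illustrates only that standard baseline, under stated assumptions: the contexts are hypothetical, the function names `tf_idf_vectors` and `cosine` are introduced here for illustration, and the paper's new term-weighting scheme is not reproduced because the abstract does not specify it.

```python
import math
from collections import Counter


def tf_idf_vectors(contexts):
    """Return one {term: weight} dict per context, using tf * idf."""
    tokenized = [c.lower().split() for c in contexts]
    n = len(tokenized)
    # Document frequency: number of contexts that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical contexts: the words found between two entity mentions.
contexts = [
    "was elected president of",
    "elected as president of",
    "acquired the company",
    "has acquired rival company",
]
vectors = tf_idf_vectors(contexts)
```

A clustering algorithm would group entity pairs whose context vectors have high cosine similarity; here the two "elected president" contexts are closer to each other than either is to the "acquired" contexts.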


Similar Documents

Extracting related named entities from blogosphere for event mining

VN-KIM IE: Automatic Extraction of Vietnamese Named-Entities on the Web

Employing Constituent Dependency Information for Tree Kernel-Based Semantic Relation Extraction between Named Entities

WebSets: extracting sets of entities from the web using unsupervised information extraction

Extracting named entity translingual equivalence with limited resources

Semantic clustering of relations between named entities

Relation extraction and the influence of automatic named-entity recognition

A relation extraction method of Chinese named entities based on location and semantic features