NDLI: Cross-lingual geo-parsing for non-structured data

Every move you make I'll be watching you: geographical focus detection on Twitter

A ranking measure for top-k moving object trajectories search

Structured toponym resolution using combined hierarchical place categories

Assessment of the accuracy of GeoNames gazetteer data

Evidential location estimation for events detected in Twitter

Criteria of query-independent page significance in geospatial web search

Construction of a Japanese gazetteers for Japanese local toponym disambiguation

Creating test collections from user generated content for GIR evaluation

Extracting spatiotemporal and semantic events from documents

Prototyping a personalized contextual retrieval framework

Cross-lingual geo-parsing for non-structured data

Geographical queries beyond conventional boundaries: regional search and exploration

Semantic extraction of geographic data from web tables for big data integration

Linking context and proximity through web corpus

GeoTxt: a web API to leverage place references in text

Where the streets have no name: experiences in GIR for a developing country

Cross-lingual geo-parsing for non-structured data

Content Provider	ACM Digital Library
Author	Gelernter, Judith; Zhang, Wei
Abstract	A geo-parser automatically identifies location words in a text. We have built a geo-parser specifically to find locations in unstructured Spanish text. Our novel geo-parser architecture combines the results of four parsers: a lexico-semantic Named Location Parser, a rules-based building parser, a rules-based street parser, and a trained Named Entity Parser. Each parser has different strengths: the Named Location Parser is strong in recall, the Named Entity Parser is strong in precision, and the building and street parsers find buildings and streets that the others are not designed to find. To test the performance of our Spanish geo-parser, we compared its output on Spanish text with the output of our English geo-parser on the same text translated into English. The Spanish geo-parser identified toponyms with an F1 of .796, and the English geo-parser identified toponyms with an F1 of .861 (despite errors introduced by translation from Spanish to English), compared to an F1 of .114 from a commercial off-the-shelf Spanish geo-parser. Results suggest that (1) geo-parsers should be built specifically for unstructured text, as our Spanish and English geo-parsers have been, and (2) location entities in Spanish text that has been machine translated to English are robust to geo-parsing in English.
Starting Page	64
Ending Page	71
Page Count	8
File Format	PDF
ISBN	9781450322416
DOI	10.1145/2533888.2533943
Language	English
Publisher	Association for Computing Machinery (ACM)
Publisher Date	2013-11-05
Publisher Place	New York
Access Restriction	Subscribed
Subject Keyword	Geo-parse; Translation; Cross-language geographic information retrieval (CLGIR); Geo-reference; Microtext; Twitter; Spanish; Location
Content Type	Text
Resource Type	Article
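
The abstract describes combining the output of several parsers and scoring the result with F1. The following is a minimal sketch of that idea, not the authors' implementation. The abstract does not say how the four parsers' results are merged, so the plain set union used here is an assumption, and the two toy parsers, the helper `find_span`, and the sample sentence are hypothetical stand-ins.

```python
from typing import Callable, Iterable, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def find_span(text: str, phrase: str) -> Set[Span]:
    """Hypothetical helper: return the first occurrence of phrase as a span set."""
    start = text.find(phrase)
    return {(start, start + len(phrase))} if start >= 0 else set()


def toy_street_parser(text: str) -> Set[Span]:
    """Stand-in for a rules-based street parser (illustration only)."""
    return find_span(text, "Calle Mayor")


def toy_named_location_parser(text: str) -> Set[Span]:
    """Stand-in for a gazetteer-style location parser (illustration only)."""
    return find_span(text, "Madrid")


def combine_parsers(
    text: str, parsers: Iterable[Callable[[str], Set[Span]]]
) -> Set[Span]:
    """Merge parser outputs by set union (an assumed merge rule)."""
    merged: Set[Span] = set()
    for parse in parsers:
        merged |= parse(text)
    return merged


def f1_score(predicted: Set[Span], gold: Set[Span]) -> float:
    """Exact-match F1 = 2PR / (P + R); returns 0.0 when nothing matches."""
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


text = "Vivo en Calle Mayor, Madrid."
gold = find_span(text, "Calle Mayor") | find_span(text, "Madrid")
combined = combine_parsers(text, [toy_street_parser, toy_named_location_parser])
single = toy_named_location_parser(text)
```

In this toy setup the combined output recovers both gold spans, while the single location parser has perfect precision but only half the recall, which mirrors the abstract's point that the parsers have complementary strengths.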
