NDLI: Automatic Treebank Conversion via Informed Decoding

Automatic Treebank Conversion via Informed Decoding - A Case Study on Chinese Treebanks

Content Provider	ACM Digital Library
Author	Zhu, Jingbo; Xiao, Tong; Zhu, Muhua
Copyright Year	2011
Abstract	Treebanks are valuable resources for syntactic parsing. For some languages, such as Chinese, multiple constituency treebanks are available, developed by different organizations. However, because their underlying annotation standards differ, such treebanks generally cannot be used together through direct data combination. To enlarge the training data for syntactic parsing, this article addresses the challenge of unifying the standards of disparate treebanks by automatically converting one treebank (the source treebank) to fit the different standard exhibited by another (the target treebank). We propose to convert a treebank in two sequential steps, corresponding to the part-of-speech level and the syntactic structure level (including tree structures and grammar labels), respectively. The approaches used at both levels can be unified as an informed decoding procedure, in which information derived from the original annotation in the source treebank guides the conversion performed by a POS tagger (or, at the syntactic structure level, a parser) trained on the target treebank. We take two Chinese treebanks as a case study; experiments on them show significant improvements in conversion accuracy over baseline systems, especially when the target treebank is small.
Starting Page	1
Ending Page	24
Page Count	24
File Format	PDF
ISSN	1530-0226
e-ISSN	1558-3430
DOI	10.1145/2002980.2002982
Volume Number	10
Issue Number	3
Journal	ACM Transactions on Asian Language Information Processing (TALIP)
Language	English
Publisher	Association for Computing Machinery (ACM)
Publisher Date	2011-09-01
Publisher Place	New York
Access Restriction	One Nation One Subscription (ONOS)
Subject Keyword	Chinese POS tagging; Chinese syntactic parsing; Informed decoding; Treebank conversion
Content Type	Text
Resource Type	Article
Subject	Computer Science
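
The abstract describes informed decoding only at a high level: a tagger trained on the target treebank converts the source treebank, guided by the source annotation. The following is a minimal sketch of that idea at the part-of-speech level, not the authors' implementation. It assumes a first-order Viterbi tagger whose per-token score is increased by a hypothetical compatibility score between the source tag and each candidate target tag. The function name, the score tables (emit, trans, guide), the guide_weight parameter, the fallback penalty, and the toy tag sets are all illustrative assumptions.

```python
def informed_viterbi(words, source_tags, target_tags, emit, trans, guide,
                     guide_weight=1.0, missing=-10.0):
    """Viterbi decoding over target tags with an extra source-tag guidance term.

    emit[(word, tag)], trans[(prev_tag, tag)] and guide[(source_tag, tag)]
    hold log-scores (all hypothetical); absent entries score `missing`.
    """
    if not words:
        return []

    def local(i, tag):
        # Target-side tagger score plus guidance from the source annotation.
        return (emit.get((words[i], tag), missing)
                + guide_weight * guide.get((source_tags[i], tag), missing))

    # best[i][tag] = (score of the best path ending in tag at i, backpointer)
    best = [{t: (local(0, t), None) for t in target_tags}]
    for i in range(1, len(words)):
        column = {}
        for t in target_tags:
            prev, score = max(
                ((p, best[i - 1][p][0] + trans.get((p, t), missing))
                 for p in target_tags),
                key=lambda pair: pair[1],
            )
            column[t] = (score + local(i, t), prev)
        best.append(column)

    # Follow backpointers from the best final tag.
    tag = max(target_tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


# Toy data: "ai" is ambiguous for the target-side tagger (VV vs. NN);
# the source tag "v" tips the decision toward VV.
words = ["wo", "ai", "shu"]
source_tags = ["r", "v", "n"]          # illustrative source-standard tags
target_tags = ["PN", "VV", "NN"]       # illustrative target-standard tags
emit = {("wo", "PN"): -0.1, ("ai", "VV"): -1.0,
        ("ai", "NN"): -1.0, ("shu", "NN"): -0.2}
trans = {}                             # uniform transitions in this toy case
guide = {("r", "PN"): -0.1, ("v", "VV"): -0.1, ("n", "NN"): -0.1}

converted = informed_viterbi(words, source_tags, target_tags, emit, trans, guide)
print(converted)
```

Setting guide_weight to 0 reduces this sketch to an ordinary tagger trained on the target treebank, which is one way to picture the uninformed baseline the abstract compares against.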

Similar Documents

Automatic treebank conversion via informed decoding.

A maximum-entropy Chinese parser augmented by transformation-based learning

Study on architectures for Chinese POS tagging and parsing.

A separately passive-aggressive training algorithm for joint POS tagging and dependency parsing.

A Separately Passive-Aggressive Training Algorithm for Joint POS Tagging and Dependency Parsing

Exploiting heterogeneous treebanks for parsing (2009)

Exploiting multiple treebanks for parsing with quasisynchronous grammars (2012)

A feature-based approach to better automatic treebank conversion

Treebanks gone bad: Parser evaluation and retraining using a treebank of ungrammatical sentences

Automatic Treebank Conversion via Informed Decoding - A Case Study on Chinese Treebanks