NDLI: Unsupervised determination of efficient Korean LVCSR units using a Bayesian Dirichlet process model

Content Provider	IEEE Xplore Digital Library
Author	Sakti, S.; Finch, A.; Isotani, R.; Kawai, H.; Nakamura, S.
Copyright Year	2011
Description	Author affiliation: Spoken Language Communication Research Group, MASTAR Project, National Institute of Information and Communications Technology (NICT), Japan (Sakti, S.; Finch, A.; Isotani, R.; Kawai, H.; Nakamura, S.)
Abstract	Korean is an agglutinative language that does not have explicit word boundaries. It is also a highly inflective language that exhibits severe coarticulation effects. These characteristics pose a challenge in developing large-vocabulary continuous speech recognition (LVCSR) systems. Many existing Korean LVCSR systems attempt to overcome these difficulties by defining a set of “word” units using morphological analysis (rule-based) or statistical methods. These approaches usually require a great deal of linguistic knowledge or at least some explicit information about the statistical distribution of the units. However, exceptions or uncommon words (e.g., foreign proper nouns) still exist that cannot be covered by rules alone. In this paper, we investigate the use of an unsupervised, nonparametric Bayesian approach to automatically determining efficient units for a Korean LVCSR system. Specifically, we utilize a Dirichlet process model trained using Bayesian inference through block Gibbs sampling. Our approach provides a principled way of learning units without explicit linguistic knowledge or any static parameters. Experiments were conducted on a travel domain corpus, which includes many foreign words and proper nouns. In our experiments we compared our method to a set of state-of-the-art baseline systems that relied on either morphological analysis or segmentation heuristics. Our system was able to produce a considerably more compact set of “word” units than the best baseline system (the lexical dictionary was approximately half the size), with a recognition accuracy 5.89% higher in terms of the relative word error rate than the best baseline system.
Starting Page	4664
Ending Page	4667
File Size	125402
Page Count	4
File Format	PDF
ISBN	9781457705380
ISSN	15206149
e-ISBN	9781457705397
e-ISBN	9781457705373
DOI	10.1109/ICASSP.2011.5947395
Language	English
Publisher	Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher Date	2011-05-22
Publisher Place	Czech Republic
Access Restriction	Subscribed
Rights Holder	Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subject Keyword	Bayesian methods; Hidden Markov models; Dictionaries; Speech; Accuracy; Training; Speech recognition; Gibbs sampling; Korean language; large-vocabulary continuous speech recognition; unsupervised segmentation; nonparametric Bayesian approach; Dirichlet process model
Content Type	Text
Resource Type	Article
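
As an illustration of the Dirichlet process model named in the abstract, the sketch below computes the standard Chinese-restaurant-process predictive probability of a candidate unit and uses it to score two segmentations of the same string. It is not the authors' implementation: the base measure `base_prob`, the parameters `n_symbols`, `p_stop` and `alpha`, and the toy counts are all assumptions made for this example, and the block Gibbs sampler itself is not implemented here.

```python
from collections import Counter

def base_prob(unit, n_symbols=50, p_stop=0.5):
    """Hypothetical base measure P0 for a non-empty unit:
    geometric length distribution times uniform symbol choice."""
    length = len(unit)
    return p_stop * (1 - p_stop) ** (length - 1) * (1.0 / n_symbols) ** length

def dp_predictive(unit, counts, total, alpha=1.0):
    """Predictive probability of `unit` under a Dirichlet process,
    given counts of previously generated units (CRP form)."""
    return (counts[unit] + alpha * base_prob(unit)) / (total + alpha)

def segmentation_prob(units, counts=None, alpha=1.0):
    """Chain-rule probability of a unit sequence; counts are updated
    after each unit so later units see the earlier ones."""
    counts = Counter(counts or {})
    total = sum(counts.values())
    prob = 1.0
    for u in units:
        prob *= dp_predictive(u, counts, total, alpha)
        counts[u] += 1
        total += 1
    return prob

# Toy counts (assumed): "ab" and "cd" have been seen before, "abcd" has not.
seen = Counter({"ab": 5, "cd": 3})
reuse = segmentation_prob(["ab", "cd"], seen)  # reuses known units
novel = segmentation_prob(["abcd"], seen)      # one previously unseen unit
print(reuse > novel)
```

Under these assumed counts the segmentation that reuses frequent units scores higher than the one that introduces a new long unit, which is the kind of preference that can lead to a compact unit inventory.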

Similar Documents

Initial Experiments with Tamil LVCSR

Unsupervised acoustic model training for the Korean language

Linguistic stem concatenation for Malay large vocabulary continuous speech recognition

Large vocabulary continuous speech recognition with context-dependent DBN-HMMs

On modeling non-word events in Large Vocabulary Continuous Speech Recognition

Segmentation-based Mongolian LVCSR approach

Turkish Large Vocabulary Continuous Speech Recognition by using limited audio corpus

Complete Recognition of Continuous Mandarin Speech for Chinese Language with Very Large Vocabulary Using Limited Training Data (1997)

On the segmentation of switching autoregressive processes by nonparametric Bayesian methods

Unsupervised determination of efficient Korean LVCSR units using a Bayesian Dirichlet process model