Joint feature selection in distributed stochastic learning for large-scale discriminative training in SMT (2012)
| Content Provider | CiteSeerX |
|---|---|
| Author | Riezler, Stefan; Dyer, Chris; Simianer, Patrick |
| Description | In Proc. ACL |
| Abstract | With a few exceptions, discriminative training in statistical machine translation (SMT) has been content with tuning weights for large feature sets on small development data. Evidence from machine learning indicates that increasing the training sample size results in better prediction. The goal of this paper is to show that this common wisdom can also be brought to bear upon SMT. We deploy local features for SCFG-based SMT that can be read off from rules at runtime, and present a learning algorithm that applies ℓ1/ℓ2 regularization for joint feature selection over distributed stochastic learning processes. We present experiments on learning on 1.5 million training sentences, and show significant improvements over tuning discriminative models on small development sets. |
| File Format | |
| Publisher Date | 2012-01-01 |
| Access Restriction | Open |
| Subject Keyword | Small Development Set; Discriminative Training; Small Development Data; Statistical Machine Translation; Training Sample Size Result; Common Wisdom; Significant Improvement; Learning Algorithm; Joint Feature Selection; Local Feature; Discriminative Model; Large-scale Discriminative Training; Present Experiment; Stochastic Learning; SCFG-based SMT; Large Feature Set; Training Sentence; Stochastic Learning Process |
| Content Type | Text |
| Resource Type | Article |
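
The abstract mentions ℓ1/ℓ2 regularization for joint feature selection across distributed learners but gives no procedure. The following is a minimal sketch of one way such a selection step could look, under assumptions not stated in this record: each shard produces a sparse weight vector, features are ranked by the ℓ2 norm of their weights across shards, only the top `k` are kept, and the surviving weights are averaged. The function name `select_and_mix`, the feature names, and the value of `k` are hypothetical and chosen only for illustration.

```python
import math


def select_and_mix(shard_weights, k):
    """Keep the k features with the largest cross-shard l2 norm and average them.

    shard_weights: list of dicts mapping feature name -> weight, one per shard.
    Assumption: a feature absent from a shard has weight 0.0 on that shard.
    """
    # Sum of squared weights per feature across all shards.
    squared = {}
    for weights in shard_weights:
        for feature, value in weights.items():
            squared[feature] = squared.get(feature, 0.0) + value * value
    norms = {feature: math.sqrt(total) for feature, total in squared.items()}

    # Rank by descending norm; break ties by name so the result is deterministic.
    kept = sorted(norms, key=lambda f: (-norms[f], f))[:k]

    # Average the surviving features over all shards.
    n = len(shard_weights)
    return {f: sum(w.get(f, 0.0) for w in shard_weights) / n for f in kept}


# Hypothetical weights from three shards.
shards = [
    {"rule_a": 0.9, "rule_b": 0.1},
    {"rule_a": 1.1, "rule_c": -0.2},
    {"rule_a": 1.0, "rule_b": 0.05, "rule_c": -0.3},
]
mixed = select_and_mix(shards, k=2)
```

Here `rule_b` is dropped because its weights are small on every shard, while `rule_a` and `rule_c` survive and receive their cross-shard average. In a distributed setting this step would presumably run between training epochs, with the mixed vector sent back to every shard as the next starting point.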