Neural Optimizers with Hypergradients for Tuning Parameter-Wise Learning Rates
| Content Provider | Semantic Scholar |
|---|---|
| Author | Ng, Ritchie; Chen, Danlu; Ilievski, Ilija; Chua, T.-S.; Fu; Pal |
| Copyright Year | 2017 |
| Abstract | Recent studies show that LSTM-based neural optimizers are competitive with state-of-the-art hand-designed optimization methods for short horizons. Existing neural optimizers learn how to update the optimizee parameters directly, i.e., they predict the product of learning rates and gradients, which we suspect is why the training task becomes unnecessarily difficult. Instead, we train a neural optimizer to control only the learning rates of another optimizer, using gradients of the training loss with respect to the learning rates. Furthermore, under the assumption that learning rates tend to remain unchanged over a certain number of iterations, the neural optimizer is only allowed to propose learning rates every S iterations, with the learning rates held fixed during those S iterations; this enables it to generalize to longer horizons. The optimizee is trained by Adam on MNIST, and our neural optimizer learns to tune the learning rates for Adam. After 5 meta-iterations, another optimizee trained by Adam, whose learning rates are tuned by the learned but frozen neural optimizer, outperforms those trained by existing hand-designed and learned neural optimizers in terms of convergence rate and final accuracy over long horizons across several datasets. |
| File Format | PDF; HTM/HTML |
| Alternate Webpage(s) | https://ilija139.github.io/pub/automl.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
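
The abstract describes tuning the learning rates of a conventional optimizer (Adam) with hypergradients, i.e., gradients of the training loss with respect to the learning rates, proposing a new rate only every S iterations. Below is a minimal sketch of that idea, assuming a toy least-squares optimizee and a plain hypergradient-descent rule in place of the paper's learned LSTM controller; the names and constants (`loss_and_grad`, `beta_hyper`, `S = 20`) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: hypergradient-style tuning of Adam's learning rate on a
# toy least-squares problem. NOT the paper's LSTM controller; the controller
# is replaced by a simple gradient step on the learning rate (assumption).
import numpy as np

rng = np.random.default_rng(0)

# Toy optimizee: linear regression, loss = 0.5 * mean((X w - y)^2)
X = rng.normal(size=(256, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.1 * rng.normal(size=256)

def loss_and_grad(w):
    r = X @ w - y
    return 0.5 * np.mean(r ** 2), X.T @ r / len(y)

# Adam state for the optimizee parameters.
w = np.zeros(10)
m = np.zeros_like(w)
v = np.zeros_like(w)
beta1, beta2, eps = 0.9, 0.999, 1e-8

lr = 1e-3           # learning rate being tuned
beta_hyper = 1e-3   # step size for the learning-rate update (assumption)
S = 20              # lr is held fixed for S iterations, as in the abstract
hypergrad_acc = 0.0
prev_update = np.zeros_like(w)

for t in range(1, 501):
    loss, g = loss_and_grad(w)

    # Hypergradient of the loss w.r.t. lr: since w_t = w_{t-1} - lr * u_{t-1},
    # dL/dlr ~= -g_t . u_{t-1}, where u is the previous Adam update direction.
    hypergrad_acc += -np.dot(g, prev_update)

    # Standard Adam step with the current (frozen) learning rate.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    update = m_hat / (np.sqrt(v_hat) + eps)
    w -= lr * update
    prev_update = update

    # Every S iterations, propose a new learning rate from the accumulated
    # hypergradient (the paper would query a learned LSTM controller here).
    if t % S == 0:
        lr = max(lr - beta_hyper * hypergrad_acc / S, 1e-6)
        hypergrad_acc = 0.0
        print(f"iter {t:4d}  loss {loss:.4f}  lr {lr:.5f}")
```

The hypergradient used in the sketch comes from differentiating the previous update with respect to the learning rate: with w_t = w_{t-1} - lr * u_{t-1}, the derivative of the training loss at w_t with respect to lr is -∇L(w_t) · u_{t-1}. Accumulating it over S steps mirrors the abstract's constraint that learning rates change only every S iterations.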