Learn how to make BERT smaller and faster
| Field | Value |
|---|---|
| Content Provider | Semantic Scholar |
| Copyright Year | 2019 |
| Abstract | Despite their superb accuracy, these huge models are difficult to use in practice. Pre-trained models typically need to be fine-tuned before use, which is very resource-hungry due to the large number of parameters. Things get even worse when serving the fine-tuned model: it requires a lot of memory and time to process a single message. What is a state-of-the-art model good for if the resulting chatbot can only handle one message per second? To make them ready for scale, we seek to accelerate today's well-performing language models, in this case by compressing them (an illustrative sketch follows the table). |
| File Format | PDF, HTM/HTML |
| Alternate Webpage(s) | https://persagen.com/files/misc/rasa2019learn.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
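
The abstract does not specify which compression technique the talk uses. As one common illustration of compressing a fine-tuned BERT model for faster serving, the sketch below applies PyTorch's post-training dynamic quantization; the checkpoint name and example sentence are placeholders, not taken from the source.

```python
# A minimal sketch (assumptions noted inline) of one way to compress a
# transformer for cheaper inference: post-training dynamic quantization.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed checkpoint; in practice this would be your fine-tuned model.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Convert the Linear layers' weights to int8; activations stay in float and
# are quantized dynamically at inference time. This shrinks the model and
# typically speeds up CPU inference, at a small cost in accuracy.
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Example message (placeholder text) run through the compressed model.
inputs = tokenizer("What time does the store open?", return_tensors="pt")
with torch.no_grad():
    logits = quantized_model(**inputs).logits
print(logits)
```

Dynamic quantization is only one option; the same serving-cost problem is also commonly addressed with knowledge distillation or pruning, which trade more training effort for smaller or faster models.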