UvA-DARE (Digital Academic Repository) Early Embedding and Late Reranking for Video
| Content Provider | Semantic Scholar |
|---|---|
| Author | Dong, Jianfeng; Li, Xirong; Lan, Weiyu; Huo, Yujia; Snoek, Cees G. M. |
| Copyright Year | 2016 |
| Abstract | This paper describes our solution for the MSR Video to Language Challenge. We start from the popular ConvNet + LSTM model, which we extend with two novel modules. One is early embedding, which enriches the current low-level input to the LSTM with tag embeddings. The other is late reranking, which re-scores generated sentences in terms of their relevance to a specific video. The modules are inspired by recent works on image captioning, repurposed and redesigned for video. As experiments on the MSR-VTT validation set show, the joint use of these two modules adds a clear improvement over a non-trivial ConvNet + LSTM baseline under four performance metrics. The viability of the proposed solution is further confirmed by the blind test run by the organizers. Our system ranked 4th in terms of overall performance, while scoring the best CIDEr-D, which measures the human-likeness of generated captions. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | https://pure.uva.nl/ws/files/8080878/DongICMR2016.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
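
The late reranking idea in the abstract, re-scoring generated sentences by their relevance to a specific video, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function name `rerank`, the `weight` parameter, the use of predicted video tags with confidences as the relevance signal, and the linear combination of the two scores.

```python
def rerank(candidates, video_tags, weight=0.5):
    """Re-score candidate captions by relevance to a video (illustrative only).

    candidates: list of (sentence, generation_score) pairs, e.g. log-probabilities
                from a caption generator (hypothetical input format).
    video_tags: dict mapping a tag to its predicted confidence for the video
                (hypothetical relevance signal).
    weight:     how much the relevance score counts against the generation score.
    """
    def relevance(sentence):
        # Sum the confidences of the video tags that appear in the sentence.
        words = set(sentence.lower().split())
        return sum(conf for tag, conf in video_tags.items() if tag in words)

    scored = [
        (sentence, (1 - weight) * gen_score + weight * relevance(sentence))
        for sentence, gen_score in candidates
    ]
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: the generator prefers the first sentence, the tags favor the second.
candidates = [("a man is talking", -1.0), ("a man is playing guitar", -1.5)]
video_tags = {"guitar": 0.9, "playing": 0.8, "man": 0.7}
best_sentence, best_score = rerank(candidates, video_tags)[0]
```

In this toy example the second candidate moves to the top because it matches more of the video's tags, even though its generation score is lower.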