Semantic Model Vectors for Complex Video Event Recognition (IEEE Transactions on Multimedia)
| Content Provider | CiteSeerX |
|---|---|
| Author | Merler, Michele; Huang, Bert; Xie, Lexing; Hua, Gang; Natsev, Apostol |
| Abstract | [...] intermediate level semantic representation, as a basis for modeling and detecting complex events in unconstrained real-world videos, such as those from YouTube. The Semantic Model Vectors are extracted using a set of discriminative semantic classifiers, each being an ensemble of SVM models trained from thousands of labeled web images, for a total of 280 generic concepts. Our study reveals that the proposed Semantic Model Vectors representation outperforms, and is complementary to, other low-level visual descriptors for video event modeling. We hence present an end-to-end video event detection system, which combines Semantic Model Vectors with other static or dynamic visual descriptors, extracted at the frame, segment or full clip level. We perform a comprehensive empirical study on the 2010 TRECVID Multimedia Event Detection task, which validates the Semantic Model Vectors representation not only as the best individual descriptor, outperforming state-of-the-art global and local static features as well as spatio-temporal HOG and HOF descriptors, but also as the most compact. We also study early and late feature fusion across the various approaches, leading to a 15% performance boost and an overall system performance of 0.46 Mean Average Precision. In order to promote further research in this direction, we made our Semantic Model Vectors for the TRECVID MED 2010 set publicly available for the community to use. Index Terms: complex video events, high level descriptor, event recognition. |
| File Format | |
| Access Restriction | Open |
| Subject Keyword | Semantic Model Vector; IEEE Transaction Multimedia; Complex Video Event Recognition; HOF Descriptor; Index Term; Complex Video Event; TRECVID Multimedia Event Detection Task; Complex Event; TRECVID MED; Various Approach; Semantic Model Vector Representation Outperforms; Individual Descriptor; Unconstrained Real-world Video; End-to-end Video Event Detection System; Video Event Modeling; Spatio-temporal HOG; Labeled Web Image; High Level Descriptor; Full Clip Level; Semantic Model Vector Representation; Comprehensive Empirical Study; Performance Boost; Late Feature Fusion; Dynamic Visual Descriptor; Overall System Performance; Study Reveals; Low-level Visual Descriptor; Local Static Feature; Mean Average Precision; Discriminative Semantic Classifier; Generic Concept; SVM Model; Intermediate Level Semantic Representation; Event Recognition |
| Content Type | Text |