Proceedings of the 2016 ACM workshop on Vision and Language Integration Meets Multimedia Fusion (iV&L-MM '16)
| Field | Value |
|---|---|
| Content Provider | ACM Digital Library |
| Editors | Moens, Marie-Francine; Tuytelaars, Tinne; Pastra, Katerina; Saenko, Kate |
| Copyright Year | 2016 |
| Abstract | It is our great pleasure to welcome you to the ACM Multimedia 2016 Workshop on Vision and Language Integration Meets Multimedia Fusion (iV&L-MM 2016) in Amsterdam, The Netherlands, on October 16, 2016. Multimodal information fusion, at both the signal and the semantic levels, is a core part of most multimedia applications, including multimedia indexing, retrieval, and summarization, among others. Early and late fusion of modality-specific processing results has been addressed in multimedia prototypes since their very early days, through various methodologies including rule-based approaches, information-theoretic models, and machine learning. Vision and language are two of the predominant modalities being fused, and they have attracted special attention in international challenges with a long history of results, such as TRECVid and ImageCLEF. During the last decade, vision-language semantic integration has attracted attention from traditionally non-interdisciplinary research communities, such as Computer Vision and Natural Language Processing, because one modality can greatly assist the processing of another, providing cues for disambiguation, complementary information, and noise/error filtering. The recent boom of deep learning methods has opened up new directions in the joint modelling of visual and co-occurring verbal information in multimedia discourse. The proceedings contain seven selected long papers, presented orally at the workshop, and three abstracts of the invited keynote speeches. The papers and abstracts discuss data collection, representation learning, deep learning approaches, matrix and tensor factorization methods, and graph-based clustering with regard to the fusion of multimedia data. A variety of applications is presented, including image captioning, summarization of news, video hyperlinking, sub-shot segmentation of user-generated video, cross-modal classification, cross-modal question answering, and the detection of misleading metadata of user-generated video. The call for papers attracted submissions from Europe, Asia, Australia, and the United States. We received 15 long papers, of which the program committee reviewed and accepted 7, for an acceptance rate of about 47%. The accepted long papers are presented orally at the workshop. We also encourage attendees to attend the keynote presentations. These valuable and insightful talks will guide us to a better understanding of the future: Explain and Answer: Relating Natural Language and Visual Recognition, by Marcus Rohrbach (University of California, Berkeley, USA); Jointly Representing Images and Text: Dependency Graphs, Word Senses, and Multimodal Embeddings, by Frank Keller (University of Edinburgh, UK); and Beyond Language and Vision, Towards Truly Multimedia Integration, by Tat-Seng Chua (National University of Singapore, Singapore). |
| ISBN | 9781450345194 |
| Language | English |
| Publisher | Association for Computing Machinery (ACM) |
| Publication Date | 2016-10-16 |
| Access Restriction | Subscribed |
| Content Type | Text |
| Resource Type | Conference Proceedings |