Abstract
We address character grounding and re-identification across multiple story-based videos, such as movies, and their associated text descriptions. To solve these related tasks in a mutually rewarding way, we propose a model named Character in Story Identification Network (CiSIN). Our method builds two semantically informative representations, Visual Track Embedding from videos and Textual Character Embedding from text context, via joint training of multiple objectives for character grounding, video/text re-identification, and gender prediction. These two representations are learned to retain rich multimodal semantic information, which enables even simple MLPs to achieve state-of-the-art performance on the target tasks. More specifically, our CiSIN model achieves the best performance in the Fill-in the Characters task of the LSMDC 2019 challenges [35]. Moreover, it outperforms previous state-of-the-art models on the M-VAD Names dataset [30], a benchmark for multimodal character grounding and re-identification.
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2020 - 16th European Conference, Proceedings |
Editors | Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 543-559 |
Number of pages | 17 |
ISBN (Print) | 9783030585570 |
DOIs | |
Publication status | Published - 2020 |
Event | 16th European Conference on Computer Vision, ECCV 2020 - Glasgow, United Kingdom Duration: 2020 Aug 23 → 2020 Aug 28 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 12350 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 16th European Conference on Computer Vision, ECCV 2020 |
---|---|
Country/Territory | United Kingdom |
City | Glasgow |
Period | 20/8/23 → 20/8/28 |
Bibliographical note
Funding Information: Acknowledgement. We thank SNUVL lab members for helpful comments. This research was supported by Seoul National University, Brain Research Program by National Research Foundation of Korea (NRF) (2017M3C7A1047860), and AIR Lab (AI Research Lab) in Hyundai Motor Company through HMC-SNU AI Consortium Fund.
Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science (all)