Weakly-supervised video scene co-parsing

Guangyu Zhong, Yi Hsuan Tsai, Ming Hsuan Yang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

In this paper, we propose a scene co-parsing framework to assign pixel-wise semantic labels in weakly labeled videos, i.e., videos for which only video-level category labels are given. To exploit rich semantic information, we first collect all videos that share the same video-level labels and segment them into supervoxels. We then select representative supervoxels for each category via a supervoxel ranking process. This ranking problem is formulated with a submodular objective function, and a scene-object classifier is incorporated to distinguish scenes from objects. To assign each supervoxel a semantic label, we match each supervoxel to these selected representatives in the feature domain. Each supervoxel is then associated with a series of category potentials and assigned the semantic label with the maximum potential. The proposed co-parsing framework extends scene parsing from single images to videos and exploits mutual information among a video collection. Experimental results on the Wild-8 and SUNY-24 datasets show that the proposed algorithm performs favorably against state-of-the-art approaches.
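
The select-then-match pipeline described in the abstract can be illustrated with a small sketch. The abstract does not give the actual objective, features, or potentials, so everything here is an assumption made for illustration: a facility-location function stands in for the submodular ranking objective, an RBF kernel stands in for feature-domain matching, the scene-object classifier is omitted, and the names `rbf_similarity`, `greedy_select`, and `assign_labels` are hypothetical.

```python
import numpy as np


def rbf_similarity(a, b, gamma=1.0):
    """Non-negative RBF similarity between rows of a (n, d) and b (m, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)


def greedy_select(features, k, gamma=1.0):
    """Greedily pick k representatives for one category.

    Assumed stand-in objective (facility location, which is submodular):
        F(S) = sum_i max_{j in S} sim[i, j]
    """
    sim = rbf_similarity(features, features, gamma)
    n = sim.shape[0]
    taken = np.zeros(n, dtype=bool)
    best = np.zeros(n)  # how well each item is covered by the current set
    selected = []
    for _ in range(min(k, n)):
        # Marginal gain of adding each candidate j to the current set.
        gains = np.maximum(sim, best[:, None]).sum(axis=0) - best.sum()
        gains[taken] = -np.inf  # never pick the same item twice
        j = int(np.argmax(gains))
        selected.append(j)
        taken[j] = True
        best = np.maximum(best, sim[:, j])
    return selected


def assign_labels(features, rep_features, rep_labels, n_classes, gamma=1.0):
    """Give each supervoxel the category with the maximum potential.

    Assumed potential: the highest similarity to any representative
    of that category.
    """
    sim = rbf_similarity(features, rep_features, gamma)
    potentials = np.zeros((features.shape[0], n_classes))
    for c in range(n_classes):
        mask = rep_labels == c
        if mask.any():
            potentials[:, c] = sim[:, mask].max(axis=1)
    return potentials.argmax(axis=1), potentials


# Toy data: two categories with well-separated 2-D "supervoxel features".
rng = np.random.default_rng(0)
pools = [rng.normal(0.0, 0.1, size=(20, 2)), rng.normal(5.0, 0.1, size=(20, 2))]

rep_features, rep_labels = [], []
for label, pool in enumerate(pools):
    idx = greedy_select(pool, k=3)
    rep_features.append(pool[idx])
    rep_labels.extend([label] * len(idx))
rep_features = np.vstack(rep_features)
rep_labels = np.array(rep_labels)

queries = np.array([[0.1, 0.0], [4.9, 5.1]])
labels, potentials = assign_labels(queries, rep_features, rep_labels, n_classes=2)
print(labels)
```

In this toy setup each query lies next to one category's pool, so its highest potential comes from that category's representatives. Greedy selection is used because it is the standard approximate maximizer for monotone submodular functions; whether the paper uses the same optimizer is not stated in the abstract.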

Original language: English
Title of host publication: Computer Vision - ACCV 2016 - 13th Asian Conference on Computer Vision, Revised Selected Papers
Editors: Yoichi Sato, Ko Nishino, Vincent Lepetit, Shang-Hong Lai
Publisher: Springer Verlag
Pages: 20-36
Number of pages: 17
ISBN (Print): 9783319541808
Publication status: Published - 2017
Event: 13th Asian Conference on Computer Vision, ACCV 2016 - Taipei, Taiwan, Province of China
Duration: 2016 Nov 20 → 2016 Nov 24

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10111 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 13th Asian Conference on Computer Vision, ACCV 2016
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 16/11/20 → 16/11/24

Bibliographical note

Funding Information:
This work is supported in part by the NSF CAREER grant #1149783, NSF IIS grant #1152576, and gifts from Adobe and Nvidia. G. Zhong is sponsored by China Scholarship Council and NSFC grant #61572099.

Publisher Copyright:
© Springer International Publishing AG 2017.

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)
