Learning to Adapt Structured Output Space for Semantic Segmentation

Yi-Hsuan Tsai, Wei-Chih Hung, Samuel Schulter, Kihyuk Sohn, Ming-Hsuan Yang, Manmohan Chandraker

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

55 Citations (Scopus)

Abstract

Convolutional neural network-based approaches for semantic segmentation rely on supervision with pixel-level ground truth, but may not generalize well to unseen image domains. As the labeling process is tedious and labor intensive, developing algorithms that can adapt source ground truth labels to the target domain is of great interest. In this paper, we propose an adversarial learning method for domain adaptation in the context of semantic segmentation. Considering semantic segmentations as structured outputs that contain spatial similarities between the source and target domains, we adopt adversarial learning in the output space. To further enhance the adapted model, we construct a multi-level adversarial network to effectively perform output space domain adaptation at different feature levels. Extensive experiments and ablation study are conducted under various domain adaptation settings, including synthetic-to-real and cross-city scenarios. We show that the proposed method performs favorably against the state-of-the-art methods in terms of accuracy and visual quality.
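The abstract describes an overall objective that combines a supervised segmentation loss on source images with adversarial losses that align target predictions to the source output distribution, applied at multiple feature levels with per-level weights. The sketch below illustrates that weighted multi-level combination in plain Python; the function name, argument names, and example weights are hypothetical, not the authors' code.

```python
# Illustrative sketch of the multi-level training objective described in the
# abstract: a supervised segmentation loss (source domain) plus an adversarial
# loss (target predictions trained to fool a source-vs-target discriminator)
# at each output level, combined with per-level weights. All names and
# numbers here are hypothetical, not the authors' implementation.

def multi_level_objective(seg_losses, adv_losses, lambda_seg, lambda_adv):
    """Combine per-level segmentation and adversarial losses.

    seg_losses[i]  -- supervised loss at output level i (source images)
    adv_losses[i]  -- adversarial loss at level i (target predictions)
    lambda_seg[i], lambda_adv[i] -- per-level weights
    """
    assert len(seg_losses) == len(adv_losses) == len(lambda_seg) == len(lambda_adv)
    total = 0.0
    for ls, la, ws, wa in zip(seg_losses, adv_losses, lambda_seg, lambda_adv):
        total += ws * ls + wa * la
    return total

# Example: two levels (final output plus one intermediate feature level),
# with the auxiliary level weighted less -- a common choice in multi-level
# adversarial setups, used here purely for illustration.
loss = multi_level_objective(
    seg_losses=[1.20, 1.50],
    adv_losses=[0.40, 0.55],
    lambda_seg=[1.0, 0.1],
    lambda_adv=[0.001, 0.0002],
)
```

In this formulation, single-level output-space adaptation is just the special case where the lists have length one.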

Original language: English
Title of host publication: Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
Publisher: IEEE Computer Society
Pages: 7472-7481
Number of pages: 10
ISBN (Electronic): 9781538664209
DOI: 10.1109/CVPR.2018.00780
Publication status: Published - 2018 Dec 14
Event: 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 - Salt Lake City, United States
Duration: 2018 Jun 18 - 2018 Jun 22

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Conference

Conference: 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
Country: United States
City: Salt Lake City
Period: 18/6/18 - 18/6/22


All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition

Cite this

Tsai, Y. H., Hung, W. C., Schulter, S., Sohn, K., Yang, M. H., & Chandraker, M. (2018). Learning to Adapt Structured Output Space for Semantic Segmentation. In Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 (pp. 7472-7481). [8578878] (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). IEEE Computer Society. https://doi.org/10.1109/CVPR.2018.00780