Estimating human pose from occluded images

Jia Bin Huang, Ming Hsuan Yang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

We address the problem of recovering 3D human pose from single 2D images, in which the pose estimation problem is formulated as a direct nonlinear regression from image observations to 3D joint positions. One key issue that has not been addressed in the literature is how to estimate 3D pose when humans in the scene are partially or heavily occluded. When occlusions occur, features extracted from image observations (e.g., silhouette-based shape features, histograms of oriented gradients, etc.) are seriously corrupted, and consequently the regressor (trained on un-occluded images) is unable to estimate pose states correctly. In this paper, we present a method that handles occlusions using sparse signal representations, in which each test sample is represented as a compact linear combination of training samples. The sparsest solution can then be obtained efficiently by solving a convex optimization problem with certain norms (such as the ℓ1-norm). The corrupted test image can be recovered as a sparse linear combination of un-occluded training images, which can then be used to estimate human pose correctly (as if no occlusions existed). We also show that the proposed approach implicitly performs relevant feature selection with un-occluded test images. Experimental results on synthetic and real data sets bear out our theory that, with sparse representation, 3D human pose can be robustly estimated when humans are partially or heavily occluded in the scene.
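As a rough illustration of the sparse-representation step described in the abstract, the following Python sketch represents an occluded test feature vector as a sparse combination of training feature vectors and reads off an occlusion-free estimate. It is not the authors' implementation: the explicit sparse error term e, the Lasso-style penalized objective, the ISTA solver (iterative soft thresholding), the names soft_threshold and sparse_recover, the penalty weight lam, and the toy data are all assumptions made for this example.

```python
def soft_threshold(v, t):
    """Proximal operator of t*|v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_recover(A, y, lam=0.1, n_iter=1000):
    """Approximately minimize 0.5*||y - A w - e||^2 + lam*(||w||_1 + ||e||_1).

    A: d rows by n columns; column j is the feature vector of training sample j.
    y: d observed feature values from a (possibly occluded) test image.
    Returns (w, e, y_rec) with y_rec = A w, the occlusion-free estimate.
    Assumption of this sketch: occlusion is modeled as a sparse error e.
    """
    d, n = len(A), len(A[0])
    # Safe ISTA step: 1 / (upper bound on the largest eigenvalue of [A I]^T [A I]).
    step = 1.0 / (sum(a * a for row in A for a in row) + d)
    w, e = [0.0] * n, [0.0] * d
    for _ in range(n_iter):
        # Residual of the current fit.
        r = [sum(A[i][j] * w[j] for j in range(n)) + e[i] - y[i] for i in range(d)]
        # Gradient step followed by soft thresholding, for w and for e.
        w = [soft_threshold(w[j] - step * sum(A[i][j] * r[i] for i in range(d)),
                            step * lam)
             for j in range(n)]
        e = [soft_threshold(e[i] - step * r[i], step * lam) for i in range(d)]
    y_rec = [sum(A[i][j] * w[j] for j in range(n)) for i in range(d)]
    return w, e, y_rec


# Toy data (hypothetical): 3 training samples with 12 features each.
columns = [
    [0.5] * 4 + [0.0] * 8,
    [0.0] * 4 + [0.5] * 4 + [0.0] * 4,
    [0.0] * 8 + [0.5] * 4,
]
A = [[col[i] for col in columns] for i in range(12)]

y_true = [2.0 * v for v in columns[0]]  # clean test features: 2 x sample 0
y_occ = list(y_true)
y_occ[3] = 5.0                          # simulate occlusion corrupting one feature

w, e, y_rec = sparse_recover(A, y_occ)
print("weights:", [round(v, 3) for v in w])
print("recovered:", [round(v, 3) for v in y_rec])
```

In this toy case the corrupted feature is absorbed by e rather than by the training weights, so y_rec stays close to the clean vector, up to a small shrinkage bias from the ℓ1 penalty. In a full pipeline, y_rec would then be passed to a regressor trained on un-occluded images; that step is omitted here.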

Original language: English
Title of host publication: Computer Vision, ACCV 2009 - 9th Asian Conference on Computer Vision, Revised Selected Papers
Pages: 48-50
Number of pages: 3
Edition: PART 1
DOIs: https://doi.org/10.1007/978-3-642-12307-8_5
Publication status: Published - 2010 Dec 29
Event: 9th Asian Conference on Computer Vision, ACCV 2009 - Xi'an, China
Duration: 2009 Sep 23 → 2009 Sep 27

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 5994 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th Asian Conference on Computer Vision, ACCV 2009
Country: China
City: Xi'an
Period: 09/9/23 → 09/9/27

Fingerprint

Convex optimization
Feature extraction
Occlusion
Linear Combination
Norm
Shape Feature
Pose Estimation
Silhouette
Nonlinear Regression
Sparse Representation
Training Samples
Human
Estimate
Feature Selection
Histogram
Gradient
Optimization Problem
Experimental Results

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Huang, J. B., & Yang, M. H. (2010). Estimating human pose from occluded images. In Computer Vision, ACCV 2009 - 9th Asian Conference on Computer Vision, Revised Selected Papers (PART 1 ed., pp. 48-50). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5994 LNCS, No. PART 1). https://doi.org/10.1007/978-3-642-12307-8_5
@inproceedings{d2dd155c5cc14e51b293b2b05c3db05e,
title = "Estimating human pose from occluded images",
author = "Huang, {Jia Bin} and Yang, {Ming Hsuan}",
year = "2010",
month = "12",
day = "29",
doi = "10.1007/978-3-642-12307-8_5",
language = "English",
isbn = "3642123066",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 1",
pages = "48--50",
booktitle = "Computer Vision, ACCV 2009 - 9th Asian Conference on Computer Vision, Revised Selected Papers",
edition = "PART 1",

}

TY - GEN

T1 - Estimating human pose from occluded images

AU - Huang, Jia Bin

AU - Yang, Ming Hsuan

PY - 2010/12/29

Y1 - 2010/12/29

UR - http://www.scopus.com/inward/record.url?scp=78650474676&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78650474676&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-12307-8_5

DO - 10.1007/978-3-642-12307-8_5

M3 - Conference contribution

AN - SCOPUS:78650474676

SN - 3642123066

SN - 9783642123061

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 48

EP - 50

BT - Computer Vision, ACCV 2009 - 9th Asian Conference on Computer Vision, Revised Selected Papers

ER -
