Head and Body Orientation Estimation Using Convolutional Random Projection Forests

Donghoon Lee, Ming Hsuan Yang, Songhwai Oh

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

In this paper, we consider the problem of estimating the head pose and body orientation of a person from a low-resolution image. Under this setting, it is difficult to reliably extract facial features or detect body parts. We propose a convolutional random projection forest (CRPforest) algorithm for these tasks. A convolutional random projection network (CRPnet) is used at each node of the forest. It maps an input image to a high-dimensional feature space using a rich filter bank. The filter bank is designed to generate sparse responses so that they can be efficiently computed by compressive sensing. A sparse random projection matrix can capture most essential information contained in the filter bank without using all the filters in it. Therefore, the CRPnet is fast, e.g., it requires $0.04\;\mathrm{ms}$ to process an image of $50\times 50$ pixels, due to the small number of convolutions (e.g., 0.01 percent of a layer of a neural network) at the expense of less than 2 percent accuracy. The overall forest estimates head and body pose well on benchmark datasets, e.g., over 98 percent on the HIIT dataset, while requiring $3.8\;\mathrm{ms}$ without using a GPU. Extensive experiments on challenging datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in low-resolution images with noise, occlusion, and motion blur.
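The abstract's claim that a sparse random projection lets the network skip most filters rests on the linearity of convolution: projecting the filter responses gives the same result as convolving once with each projected (composite) filter. The following is a minimal sketch of that identity only, not the authors' CRPnet. The filter bank, its sizes, the sparsity parameter `s`, and the helper names `sparse_projection_matrix` and `correlate_valid` are all assumptions made for illustration, and the matrix uses a generic sparse sign construction (entries of +1, 0, -1 scaled by sqrt(s)) rather than anything specified in the paper.

```python
import numpy as np


def sparse_projection_matrix(n_out, n_in, s, rng):
    """Hypothetical sparse random matrix: entries are sqrt(s) * {+1, 0, -1}
    drawn with probabilities 1/(2s), 1 - 1/s, 1/(2s)."""
    signs = rng.choice(
        [1.0, 0.0, -1.0],
        size=(n_out, n_in),
        p=[0.5 / s, 1.0 - 1.0 / s, 0.5 / s],
    )
    return np.sqrt(s) * signs


def correlate_valid(image, kernel):
    """2-D cross-correlation over positions where the kernel fits entirely."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


rng = np.random.default_rng(0)
image = rng.standard_normal((50, 50))        # stand-in for a 50x50 input image
filters = rng.standard_normal((64, 5, 5))    # assumed bank of 64 filters, 5x5 each
R = sparse_projection_matrix(8, 64, s=8, rng=rng)

# Route A: apply all 64 filters, then project the 64 responses down to 8.
responses = np.stack([correlate_valid(image, f) for f in filters])
projected_after = np.tensordot(R, responses, axes=1)

# Route B: project the filter bank first, then apply only 8 composite filters.
composite = np.tensordot(R, filters, axes=1)
projected_before = np.stack([correlate_valid(image, f) for f in composite])

# Both routes agree up to floating-point error.
assert np.allclose(projected_after, projected_before)
```

Route B performs 8 correlations instead of 64, and because most entries of `R` are zero, each composite filter is built from only a few of the original filters.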

Original language: English
Article number: 8219761
Pages (from-to): 107-120
Number of pages: 14
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 41
Issue number: 1
DOI: 10.1109/TPAMI.2017.2784424
Publication status: Published - 2019 Jan 1

Fingerprint

Random Projection
Filter banks
Image resolution
Percent
Motion Blur
Convolution
Projection Matrix
Compressive Sensing
Feature Space
Pixels
Random Matrices
Occlusion
Neural networks
Person
High-dimensional
Filter
Benchmark

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
  • Computational Theory and Mathematics
  • Artificial Intelligence
  • Applied Mathematics

Cite this

@article{e45c6acd030042c4b62c713f25fdda90,
title = "Head and Body Orientation Estimation Using Convolutional Random Projection Forests",
author = "Donghoon Lee and Yang, {Ming Hsuan} and Songhwai Oh",
year = "2019",
month = "1",
day = "1",
doi = "10.1109/TPAMI.2017.2784424",
language = "English",
volume = "41",
pages = "107--120",
journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
issn = "0162-8828",
publisher = "IEEE Computer Society",
number = "1",

}

Head and Body Orientation Estimation Using Convolutional Random Projection Forests. / Lee, Donghoon; Yang, Ming Hsuan; Oh, Songhwai.

In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 41, No. 1, 8219761, 01.01.2019, p. 107-120.

TY - JOUR

T1 - Head and Body Orientation Estimation Using Convolutional Random Projection Forests

AU - Lee, Donghoon

AU - Yang, Ming Hsuan

AU - Oh, Songhwai

PY - 2019/1/1

Y1 - 2019/1/1

UR - http://www.scopus.com/inward/record.url?scp=85039784169&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85039784169&partnerID=8YFLogxK

U2 - 10.1109/TPAMI.2017.2784424

DO - 10.1109/TPAMI.2017.2784424

M3 - Article

C2 - 29990037

AN - SCOPUS:85039784169

VL - 41

SP - 107

EP - 120

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

SN - 0162-8828

IS - 1

M1 - 8219761

ER -