Learning to Recognize Three-Dimensional Objects

Dan Roth, Ming Hsuan Yang, Narendra Ahuja

Research output: Contribution to journal › Article

45 Citations (Scopus)

Abstract

A learning account for the problem of object recognition is developed within the probably approximately correct (PAC) model of learnability. The key assumption underlying this work is that objects can be recognized (or discriminated) using simple representations in terms of syntactically simple relations over the raw image. Although the potential number of these simple relations could be huge, only a few of them are actually present in each observed image, and a fairly small number of those observed are relevant to discriminating an object. We show that these properties can be exploited to yield an efficient learning approach in terms of sample and computational complexity within the PAC model. No assumptions are needed on the distribution of the observed objects, and the learning performance is quantified relative to its experience. Most important, the success of learning an object representation is naturally tied to the ability to represent it as a function of some intermediate representations extracted from the image. We evaluate this approach in a large-scale experimental study in which the SNoW learning architecture is used to learn representations for the 100 objects in the Columbia Object Image Library. Experimental results exhibit good generalization and robustness properties of the SNoW-based method relative to other approaches. SNoW's recognition rate degrades more gracefully when the training data contains fewer views, and it shows similar behavior in some preliminary experiments with partially occluded objects.
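
The sparsity property the abstract relies on (few relations active per image, fewer still relevant) can be illustrated with a minimal sketch of a Winnow-style multiplicative update over sparse Boolean features, the kind of update rule SNoW is built around. This is not the authors' implementation: the class name `Winnow`, the parameter values, and the toy feature names (`"a"`, `"x"`, `"y"`, `"z"`) are assumptions chosen only for illustration, and the real system learns one such unit per object over relations extracted from images.

```python
from collections import defaultdict


class Winnow:
    """Illustrative positive-Winnow unit over sparse Boolean features."""

    def __init__(self, threshold=2.0, alpha=2.0, beta=0.5, init_weight=1.0):
        self.threshold = threshold  # prediction threshold (assumed value)
        self.alpha = alpha          # promotion factor, > 1
        self.beta = beta            # demotion factor, in (0, 1)
        # Weights are allocated lazily, so only features that actually
        # occur in some example ever take up memory.
        self.weights = defaultdict(lambda: init_weight)

    def predict(self, active):
        # Only the active features contribute to the score.
        return sum(self.weights[f] for f in active) >= self.threshold

    def update(self, active, label):
        """Mistake-driven update; returns True if a mistake was made."""
        if self.predict(active) == label:
            return False
        factor = self.alpha if label else self.beta
        for f in active:
            self.weights[f] *= factor
        return True


# Toy data: the target is "feature 'a' is present" (hypothetical feature names).
data = [
    ({"a", "x"}, True),
    ({"x", "y"}, False),
    ({"a", "z"}, True),
    ({"y", "z"}, False),
]

model = Winnow()
mistakes = 0
for _ in range(10):  # cap on the number of passes over the data
    made = sum(model.update(active, label) for active, label in data)
    mistakes += made
    if made == 0:  # stop once a full pass makes no mistakes
        break
```

Each update touches only the features active in the current example, so the per-example cost depends on the number of active features rather than on the size of the full feature space.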

Original language: English
Pages (from-to): 1071-1103
Number of pages: 33
Journal: Neural Computation
Volume: 14
Issue number: 5
Publication status: Published - 2002 May 1

All Science Journal Classification (ASJC) codes

  • Arts and Humanities (miscellaneous)
  • Cognitive Neuroscience
