Learning hierarchical image representation with sparsity, saliency and locality

Jimei Yang, Ming-Hsuan Yang

Research output: Contribution to conference › Paper › peer-review

14 Citations (Scopus)

Abstract

This paper presents a deep learning model for building hierarchical image representations. Each layer of the hierarchy consists of three components: sparse coding, saliency pooling and local grouping. Sparse coding identifies distinctive coefficients for representing the raw features of the lower layer; saliency pooling helps suppress noise and enhances the translation invariance of the sparse representation; local grouping combines pooled sparse codes into more complex representations. Instead of relying on hand-crafted descriptors, our model learns an effective image representation directly from images in an unsupervised, data-driven manner. We evaluate our algorithm on several benchmark object recognition databases and analyze the contributions of the different components. Experimental results show that our algorithm performs favorably against state-of-the-art methods.
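The abstract is only a high-level description of one layer of the pipeline, but the three stages can be illustrated with a minimal sketch. The soft-thresholding encoder, the multiplicative saliency weighting, the pooling cell sizes and all function names below are assumptions made for illustration, not the authors' exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_code(patches, dictionary, lam=0.1):
    """Encode each patch as sparse coefficients over a dictionary.
    A single soft-thresholding step stands in for a full sparse solver."""
    z = patches @ dictionary.T                     # (n_patches, n_atoms)
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def saliency_pool(codes, saliency, grid, cell):
    """Max-pool codes over non-overlapping spatial cells, weighting each
    location by its saliency (noise suppression + translation invariance)."""
    h, w = grid
    codes = (codes * saliency[:, None]).reshape(h, w, -1)
    ch, cw = cell
    return codes.reshape(h // ch, ch, w // cw, cw, -1).max(axis=(1, 3))

def local_group(pooled, k=2):
    """Concatenate each k x k block of pooled codes into one longer vector,
    forming the more complex representation fed to the next layer."""
    H, W, d = pooled.shape
    blocks = pooled[:H - H % k, :W - W % k].reshape(H // k, k, W // k, k, d)
    return blocks.transpose(0, 2, 1, 3, 4).reshape(H // k, W // k, k * k * d)

# Toy example: a 16x16 grid of 64-dim raw features, 128 dictionary atoms.
patches = rng.standard_normal((16 * 16, 64))
dictionary = rng.standard_normal((128, 64))
saliency = rng.random(16 * 16)                     # per-location saliency weights

codes = sparse_code(patches, dictionary)
pooled = saliency_pool(codes, saliency, grid=(16, 16), cell=(2, 2))
grouped = local_group(pooled, k=2)                 # input to the next layer
print(grouped.shape)                               # (4, 4, 512)
```

In the paper the dictionary is learned from data rather than sampled at random; stacking such layers, with each layer's grouped output serving as the next layer's raw features, yields the hierarchical representation.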

Original language: English
DOIs
Publication status: Published - 2011
Event: 2011 22nd British Machine Vision Conference, BMVC 2011 - Dundee, United Kingdom
Duration: 2011 Aug 29 - 2011 Sept 2

Conference

Conference: 2011 22nd British Machine Vision Conference, BMVC 2011
Country/Territory: United Kingdom
City: Dundee
Period: 11/8/29 - 11/9/2

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
