Existing saliency models typically incorporate contexts holistically. However, for each pixel, usually only part of its context region contributes to saliency prediction, while the other parts are likely to be noise or distractions. In this paper, we propose a novel pixel-wise contextual attention network (PiCANet) that selectively attends to informative context locations at each pixel. The proposed PiCANet generates an attention map over the contextual region of each pixel and constructs attentive contextual features by selectively incorporating the features of useful context locations. We present three formulations of the PiCANet by embedding the pixel-wise contextual attention mechanism into pooling and convolution operations that attend to global or local contexts. All three models are fully differentiable and can be integrated with convolutional neural networks and trained jointly. In this work, we embed the proposed PiCANets into a U-Net model for salient object detection. The generated global and local attention maps learn to incorporate global contrast and regional smoothness, which helps localize and highlight salient objects more accurately and uniformly. Experimental results show that the proposed PiCANets perform favorably against state-of-the-art methods on saliency detection. Furthermore, we demonstrate the effectiveness and generalization ability of the PiCANets on semantic segmentation and object detection, where they also improve performance.
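The attentive pooling idea described above can be illustrated with a minimal sketch. It covers only the global pooling case and rests on several assumptions that are not stated in the abstract: the attention map at each pixel is taken to be a softmax over all context locations, the unnormalized scores are passed in as plain inputs rather than predicted by a learned sub-network, and the names `softmax` and `global_attentive_pooling` are hypothetical. It is not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def global_attentive_pooling(features, logits):
    """Attend over every location of the feature map at each pixel.

    features: H x W x C nested lists (the context features).
    logits:   H x W x (H*W) nested lists; logits[i][j] holds one
              unnormalized score per context location, in row-major
              order. In a trained network these scores would come from
              a learned sub-network; here they are inputs (assumption).
    Returns an H x W x C map of attended contextual features.
    """
    height, width = len(features), len(features[0])
    channels = len(features[0][0])
    # Flatten context locations in row-major order to match the logits.
    context = [features[r][c] for r in range(height) for c in range(width)]
    output = []
    for i in range(height):
        row = []
        for j in range(width):
            # Attention map of pixel (i, j) over all context locations.
            weights = softmax(logits[i][j])
            # Weighted sum of context features, one value per channel.
            attended = [
                sum(w * ctx[k] for w, ctx in zip(weights, context))
                for k in range(channels)
            ]
            row.append(attended)
        output.append(row)
    return output
```

With uniform scores the operation reduces to global average pooling, while a sharply peaked score makes a pixel copy the feature of a single context location. This is the sense in which attention lets each pixel choose which parts of its context to use. A local variant would restrict `context` to a window around `(i, j)`.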
Bibliographical note
Funding Information:
Manuscript received July 24, 2019; revised February 12, 2020; accepted April 6, 2020. Date of publication April 23, 2020; date of current version July 2, 2020. This work was supported in part by the National Science Foundation of China under Grant U1801265, in part by the Key Research and Development Program of Guangdong Province under Grant 2019B010110001, in part by the Research Funds for Interdisciplinary Subject, NWPU, and in part by NSF CAREER under Grant 1149783. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Jing-Ming Guo. (Corresponding author: Junwei Han.) Nian Liu is with the School of Automation, Northwestern Polytechnical University, Xi’an 710072, China, and also with the Department of Engagement Services, Mohamed Bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates (e-mail: firstname.lastname@example.org).
© 1992-2012 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Graphics and Computer-Aided Design