Recently, learned image compression methods have out-performed traditional hand-crafted ones including BPG. One of the keys to this success is learned entropy models that estimate the probability distribution of the quantized latent representation. Like other vision tasks, most recent learned entropy models are based on convolutional neural networks (CNNs). However, CNNs have a limitation in modeling long-range dependencies due to their nature of local connectivity, which can be a significant bottleneck in image compression where reducing spatial redundancy is a key point. To overcome this issue, we propose a novel entropy model called Information Transformer (Informer) that exploits both global and local information in a content-dependent manner using an attention mechanism. Our experiments show that Informer improves rate-distortion performance over the state-of-the-art methods on the Kodak and Tecnick datasets without the quadratic computational complexity problem. Our source code is available at https://github.com/naver-ai/informer.
|Title of host publication||Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022|
|Publisher||IEEE Computer Society|
|Number of pages||10|
|Publication status||Published - 2022|
|Event||2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022 - New Orleans, United States|
Duration: 2022 Jun 19 → 2022 Jun 24
|Name||Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition|
|Conference||2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022|
|Period||22/6/19 → 22/6/24|
Bibliographical noteFunding Information:
All experiments were conducted on the NAVER Smart Machine Learning (NSML) platform . This work was supported by the NRF grant funded by the Korea government (MSIT) (2021R1A2C2011474).
© 2022 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition