MULTI-FRAME VIDEO PREDICTION WITH LEARNABLE MOTION ENCODINGS

Rakesh Jasti, Varun Jampani, Deqing Sun, Ming Hsuan Yang

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

Abstract

Predicting multiple future frames from a given video is challenging because of factors such as camera motion, dynamically moving objects, and occlusions. While recent deep learning methods have made significant progress on video prediction, most predict only the immediate next frame or a fixed number of future frames. To obtain longer-term predictions, existing techniques usually feed the predicted frames back into the model iteratively, which results in blurry or inconsistent predictions. In this work, we present a new approach that can predict an arbitrary number of future video frames with a single forward pass through the network. Instead of directly predicting a fixed number of future optical flows or frames, we learn temporal motion encodings, i.e., temporal motion basis vectors and a network that predicts their coefficients. The learned motion basis can be easily extended to arbitrary length at inference time, enabling us to predict an arbitrary number of future frames. Experiments on benchmark datasets show that our approach performs favorably against several competitive techniques even in the next-frame prediction setting. When evaluated under 5-frame or 10-frame prediction settings, the proposed method achieves larger performance gains over state-of-the-art techniques that process their predictions iteratively.
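
The basis-and-coefficients idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the polynomial basis, the nearest-neighbour warp, the flow convention, and all function names (motion_basis, flows_from_coefficients, warp_nearest, predict_frames) are assumptions made for illustration, and in the paper both the basis and the coefficient predictor are learned. The sketch only shows why a temporal basis that can be evaluated at any time step gives flows for any number of future frames from one set of per-pixel coefficients.

```python
import numpy as np


def motion_basis(num_steps, num_basis):
    """Hypothetical temporal basis b_k(t) = t**(k+1) for t = 1..num_steps.

    Assumption: the paper learns its basis; a polynomial is used here only
    because it can be evaluated at any t. Returns shape (num_steps, num_basis).
    """
    t = np.arange(1, num_steps + 1, dtype=np.float64)
    return np.stack([t ** (k + 1) for k in range(num_basis)], axis=1)


def flows_from_coefficients(coeffs, basis):
    """Combine per-pixel coefficients with the temporal basis.

    coeffs: (K, H, W, 2), basis: (T, K) -> flows: (T, H, W, 2).
    """
    return np.einsum("tk,khwc->thwc", basis, coeffs)


def warp_nearest(frame, flow):
    """Warp a (H, W) frame by a (H, W, 2) flow holding (dx, dy) per pixel.

    Convention assumed here: output[y, x] = frame[y - dy, x - dx], sampled
    at the nearest pixel and clamped at the image border.
    """
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs - flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys - flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]


def predict_frames(last_frame, coeffs, num_steps):
    """Predict num_steps future frames from one set of coefficients."""
    basis = motion_basis(num_steps, coeffs.shape[0])
    flows = flows_from_coefficients(coeffs, basis)
    return np.stack([warp_nearest(last_frame, f) for f in flows])


# Toy input: one bright pixel moving right at 1 pixel per frame.
frame = np.zeros((5, 8))
frame[2, 1] = 1.0
coeffs = np.zeros((2, 5, 8, 2))   # K = 2 basis functions
coeffs[0, ..., 0] = 1.0           # linear term: dx = 1 * t, dy = 0

five = predict_frames(frame, coeffs, 5)    # 5-frame setting
ten = predict_frames(frame, coeffs, 10)    # same coefficients, longer horizon
```

In a real model the coefficients would come from a network applied to the input frames; here they are set by hand. Changing num_steps only changes how far the basis is evaluated, so the 5-frame and 10-frame predictions reuse the same coefficients without feeding predicted frames back in.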

Original language: English
Title of host publication: 2022 IEEE International Conference on Image Processing, ICIP 2022 - Proceedings
Publisher: IEEE Computer Society
Pages: 4198-4202
Number of pages: 5
ISBN (Electronic): 9781665496209
Publication status: Published - 2022
Event: 29th IEEE International Conference on Image Processing, ICIP 2022 - Bordeaux, France
Duration: 2022 Oct 16 → 2022 Oct 19

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
ISSN (Print): 1522-4880

Conference

Conference: 29th IEEE International Conference on Image Processing, ICIP 2022
Country/Territory: France
City: Bordeaux
Period: 22/10/16 → 22/10/19

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
  • Signal Processing
