FutureGAN: Anticipating the Future Frames of Video Sequences using Spatio-Temporal 3d Convolutions in Progressively Growing GANs
October 02, 2018 · Entered Twilight · The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
"Last commit was 7.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: FutureGAN_env.yml, ReadMe.md, custom_layers.py, data, eval.py, eval_metrics.py, imgsrc, model.py, optflow.py, plot_eval.py, tb_logger.py, train.py, utils.py, video_dataset.py
Authors
Sandra Aigner, Marco Körner
arXiv ID
1810.01325
Category
cs.CV: Computer Vision
Citations
78
Venue
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Repository
https://github.com/TUM-LMF/FutureGAN
82 stars
Last Checked
2 months ago
Abstract
We introduce a new encoder-decoder GAN model, FutureGAN, that predicts future frames of a video sequence conditioned on a sequence of past frames. During training, the networks receive only the raw pixel values as input, without relying on additional constraints or dataset-specific conditions. To capture both the spatial and temporal components of a video sequence, spatio-temporal 3d convolutions are used in all encoder and decoder modules. Further, we utilize concepts of the existing progressively growing GAN (PGGAN), which achieves high-quality results in generating high-resolution single images. The FutureGAN model extends this concept to the complex task of video prediction. We conducted experiments on three different datasets: MovingMNIST, KTH Action, and Cityscapes. Our results show that the model learned representations to transform the information of an input sequence into a plausible future sequence effectively for all three datasets. The main advantage of the FutureGAN framework is that it is applicable to various datasets without additional changes, whilst achieving stable results that are competitive with the state-of-the-art in video prediction. Our code is available at https://github.com/TUM-LMF/FutureGAN.
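To make the idea of a spatio-temporal 3d convolution concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes a single-channel clip indexed as `clip[t][y][x]`, no padding, and stride 1; the names `conv3d_valid`, `clip`, `kernel`, and `motion` are hypothetical and chosen only for illustration. The point it tries to show is that the kernel slides along the time axis as well as the two spatial axes, so a single filter can respond to motion between frames.

```python
from typing import List

Clip = List[List[List[float]]]  # indexed as clip[t][y][x] (hypothetical layout)


def conv3d_valid(clip: Clip, kernel: Clip) -> Clip:
    """Naive single-channel 3D cross-correlation, no padding, stride 1."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):          # slide over time
        plane = []
        for y in range(H - kh + 1):      # slide over height
            row = []
            for x in range(W - kw + 1):  # slide over width
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# A 3-frame, 3x3 clip in which one bright pixel moves one step right per frame.
clip = [
    [[1.0 if (y == 1 and x == t) else 0.0 for x in range(3)] for y in range(3)]
    for t in range(3)
]

# A temporal-difference kernel: 2 frames deep, 1x1 in space (next frame minus current).
kernel = [[[-1.0]], [[1.0]]]

motion = conv3d_valid(clip, kernel)
```

With this toy kernel, the output has one fewer frame than the input, and its middle row is negative where the pixel was and positive where it moved to. A learned model would use many multi-channel kernels with spatial extent as well, typically through a deep learning framework rather than explicit loops.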
Similar Papers
In the same crypt: Computer Vision
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
R.I.P.
Ghosted
You Only Look Once: Unified, Real-Time Object Detection
Old Age
SSD: Single Shot MultiBox Detector
Old Age
Squeeze-and-Excitation Networks
R.I.P.
Ghosted