DiFaReli++: Diffusion Face Relighting with Consistent Cast Shadows

April 19, 2023 · Declared Dead · 🏛 ICCV 2023

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Puntawat Ponglertnapakorn, Nontawat Tritrong, Supasorn Suwajanakorn
arXiv ID: 2304.09479
Category: cs.CV (Computer Vision)
Cross-listed: cs.GR, cs.LG
Citations: 0
Venue: ICCV 2023
Last Checked: 4 months ago

Abstract

We introduce a novel approach to single-view face relighting in the wild, addressing challenges such as global illumination and cast shadows. A common scheme in recent methods involves intrinsically decomposing an input image into 3D shape, albedo, and lighting, then recomposing it with the target lighting. However, estimating these components is error-prone and requires many training examples with ground-truth lighting to generalize well. Our work bypasses the need for accurate intrinsic estimation and can be trained solely on 2D images without any light stage data, relit pairs, multi-view images, or lighting ground truth. Our key idea is to leverage a conditional diffusion implicit model (DDIM) for decoding a disentangled light encoding along with other encodings related to 3D shape and facial identity inferred from off-the-shelf estimators. We propose a novel conditioning technique that simplifies modeling the complex interaction between light and geometry. It uses a rendered shading reference along with a shadow map, inferred using a simple and effective technique, to spatially modulate the DDIM. Moreover, we propose a single-shot relighting framework that requires just one network pass, given pre-processed data, and even outperforms the teacher model across all metrics. Our method realistically relights in-the-wild images with temporally consistent cast shadows under varying lighting conditions. We achieve state-of-the-art performance on the standard benchmark Multi-PIE and rank highest in user studies.
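The abstract's idea of using a shading reference and a shadow map to "spatially modulate" the network can be illustrated with a generic per-pixel scale-and-shift operation. The sketch below is not the authors' architecture. The function name `spatially_modulate`, the 1x1 projection weights `w_scale` and `w_shift`, and the tensor shapes are all assumptions chosen for illustration, and NumPy stands in for a deep-learning framework.

```python
import numpy as np

def spatially_modulate(features, shading_ref, shadow_map, w_scale, w_shift):
    """Hypothetical per-pixel scale-and-shift modulation of a feature map.

    features:    (C, H, W) intermediate activations of a denoising network
    shading_ref: (3, H, W) rendered shading reference
    shadow_map:  (1, H, W) cast-shadow map
    w_scale, w_shift: (C, 4) weights of illustrative 1x1 projections
    """
    # Stack the two conditioning images along the channel axis: (4, H, W).
    cond = np.concatenate([shading_ref, shadow_map], axis=0)
    # A 1x1 convolution is a per-pixel linear map over channels.
    scale = np.einsum("ck,khw->chw", w_scale, cond)
    shift = np.einsum("ck,khw->chw", w_shift, cond)
    # Each spatial location gets its own scale and shift.
    return features * (1.0 + scale) + shift

# Toy usage with random data (all shapes are assumptions).
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16, 16))
shading = rng.random((3, 16, 16))
shadow = rng.random((1, 16, 16))
w_scale = 0.1 * rng.standard_normal((8, 4))
w_shift = 0.1 * rng.standard_normal((8, 4))
out = spatially_modulate(features, shading, shadow, w_scale, w_shift)
print(out.shape)  # (8, 16, 16)
```

Because the scale and shift are computed per pixel from the conditioning images, a change in the shadow map at one location only affects the features at that location, which is the sense in which the conditioning is spatial rather than a single global vector.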