Text-Guided Texturing by Synchronized Multi-View Diffusion

November 21, 2023 · Declared Dead · 🏛 ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Yuxin Liu, Minshan Xie, Hanyuan Liu, Tien-Tsin Wong
arXiv ID: 2311.12891
Category: cs.CV (Computer Vision)
Citations: 78
Venue: ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia
Last Checked: 1 month ago

Abstract

This paper introduces a novel approach for synthesizing texture to dress up a given 3D object from a text prompt. Building on pretrained text-to-image (T2I) diffusion models, existing methods usually employ a project-and-inpaint approach, in which a view of the given object is first generated and then warped to another view for inpainting. However, this approach tends to generate inconsistent texture because the diffusion of multiple views is asynchronous. We believe such asynchronous diffusion and insufficient information sharing among views are the root causes of the inconsistency artifacts. In this paper, we propose a synchronized multi-view diffusion approach that allows the diffusion processes from different views to reach a consensus on the generated content early in the process, and hence ensures texture consistency. To synchronize the diffusion, we share the denoised content among different views in each denoising step, specifically by blending the latent content in the texture domain from views that overlap. Our method demonstrates superior performance in generating consistent, seamless, highly detailed textures compared to state-of-the-art methods.
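
The synchronization step described in the abstract (blending latent content from overlapping views in the texture domain at each denoising step) can be illustrated with a toy sketch. This is not the authors' implementation: the function name `synchronize_views`, the flat pixel-to-texel index arrays, and the per-pixel blending weights are all assumptions made for illustration, and the abstract does not specify how the blending weights are chosen.

```python
import numpy as np


def synchronize_views(latents, texel_ids, weights, num_texels):
    """Blend per-view latents in a shared texture domain, then project back.

    Hypothetical interface (not from the paper):
      latents:   list of (P_v, C) arrays, one row per visible pixel of view v
      texel_ids: list of (P_v,) int arrays, the texel each pixel maps to
      weights:   list of (P_v,) float arrays, per-pixel blending weights
    Returns the synchronized per-view latents and the blended texture.
    """
    channels = latents[0].shape[1]
    tex_sum = np.zeros((num_texels, channels))
    w_sum = np.zeros(num_texels)

    # Scatter every view's weighted latents into the shared texture.
    for z, ids, w in zip(latents, texel_ids, weights):
        np.add.at(tex_sum, ids, z * w[:, None])
        np.add.at(w_sum, ids, w)

    # Normalize only texels that at least one view covers.
    covered = w_sum > 0
    texture = np.zeros_like(tex_sum)
    texture[covered] = tex_sum[covered] / w_sum[covered, None]

    # Gather the blended texture back into each view, so pixels that
    # map to the same texel now hold the same latent value.
    synced = [texture[ids] for ids in texel_ids]
    return synced, texture


# Two toy views that overlap on texel 1.
view_latents = [np.array([[1.0], [3.0]]), np.array([[5.0], [7.0]])]
view_texels = [np.array([0, 1]), np.array([1, 2])]
view_weights = [np.ones(2), np.ones(2)]

synced, texture = synchronize_views(view_latents, view_texels, view_weights, 3)
print(texture.ravel())
```

In a full pipeline, a step like this would presumably run once per denoising iteration, between the per-view denoiser call and the next step, so that overlapping regions agree early in the process as the abstract describes.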