CCM: Adding Conditional Controls to Text-to-Image Consistency Models
December 12, 2023 ยท Entered Twilight ยท ๐ arXiv.org
"No code URL or promise found in abstract"
"Derived repo from GitHub Pages (backfill)"
Evidence collected by the PWNC Scanner
Repo contents: .gitattributes, index.html, static
Authors
Jie Xiao, Kai Zhu, Han Zhang, Zhiheng Liu, Yujun Shen, Yu Liu, Xueyang Fu, Zheng-Jun Zha
arXiv ID
2312.06971
Category
cs.CV: Computer Vision
Citations
11
Venue
arXiv.org
Repository
https://github.com/swiftforce/CCM
Last Checked
1 month ago
Abstract
Consistency Models (CMs) have showed a promise in creating visual content efficiently and with high quality. However, the way to add new conditional controls to the pretrained CMs has not been explored. In this technical report, we consider alternative strategies for adding ControlNet-like conditional control to CMs and present three significant findings. 1) ControlNet trained for diffusion models (DMs) can be directly applied to CMs for high-level semantic controls but struggles with low-level detail and realism control. 2) CMs serve as an independent class of generative models, based on which ControlNet can be trained from scratch using Consistency Training proposed by Song et al. 3) A lightweight adapter can be jointly optimized under multiple conditions through Consistency Training, allowing for the swift transfer of DMs-based ControlNet to CMs. We study these three solutions across various conditional controls, including edge, depth, human pose, low-resolution image and masked image with text-to-image latent consistency models.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Computer Vision
๐
๐
Old Age
๐
๐
Old Age
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
R.I.P.
๐ป
Ghosted
You Only Look Once: Unified, Real-Time Object Detection
๐
๐
Old Age
SSD: Single Shot MultiBox Detector
๐
๐
Old Age
Squeeze-and-Excitation Networks
R.I.P.
๐ป
Ghosted