VisemeNet: Audio-Driven Animator-Centric Speech Animation

May 24, 2018 · Declared Dead · 🛠 ACM Transactions on Graphics

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Yang Zhou, Zhan Xu, Chris Landreth, Evangelos Kalogerakis, Subhransu Maji, Karan Singh
arXiv ID: 1805.09488
Category: cs.GR (Graphics)
Citations: 159
Venue: ACM Transactions on Graphics
Last Checked: 1 month ago
Abstract
We present a novel deep-learning-based approach to producing animator-centric speech motion curves that drive a JALI or standard FACS-based production face-rig, directly from input audio. Our three-stage Long Short-Term Memory (LSTM) network architecture is motivated by psycho-linguistic insights: segmenting speech audio into a stream of phonetic groups is sufficient for viseme construction; speech styles like mumbling or shouting are strongly correlated with the motion of facial landmarks; and animator style is encoded in viseme motion-curve profiles. Our contribution is an automatic, real-time lip-synchronization-from-audio solution that integrates seamlessly into existing animation pipelines. We evaluate our results by: cross-validation against ground-truth data; animator critique and edits; visual comparison to recent deep-learning lip-synchronization solutions; and showing our approach to be resilient to diversity in speaker and language.
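Since no code was ever released, any implementation of this paper is guesswork. Purely as an illustration of the three-stage design the abstract describes (audio → phonetic-group stream, audio → landmark motion, both → viseme motion curves), here is a toy numpy sketch. The LSTM cell, the feature dimensions, and the random weights are all hypothetical stand-ins, not the authors' network:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTM:
    """Minimal single-layer LSTM forward pass (a stand-in; the paper's
    actual architecture and trained weights were never released)."""
    def __init__(self, in_dim, hid_dim):
        s = 1.0 / np.sqrt(hid_dim)
        self.W = rng.uniform(-s, s, (4 * hid_dim, in_dim + hid_dim))
        self.b = np.zeros(4 * hid_dim)
        self.hid_dim = hid_dim

    def forward(self, xs):  # xs: (T, in_dim) -> (T, hid_dim)
        h = np.zeros(self.hid_dim)
        c = np.zeros(self.hid_dim)
        out = []
        for x in xs:
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, o, g = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
            out.append(h)
        return np.stack(out)

# Hypothetical dimensions: per-frame audio features, phonetic groups,
# 2-D landmark offsets, and JALI/FACS-style viseme control curves.
T, AUDIO, GROUPS, LANDMARKS, VISEMES = 50, 65, 20, 76, 29

stage1 = LSTM(AUDIO, 128)               # audio -> phonetic-group stream
stage2 = LSTM(AUDIO, 128)               # audio -> facial-landmark motion
stage3 = LSTM(GROUPS + LANDMARKS, 128)  # combined -> viseme motion curves
Wp = rng.standard_normal((GROUPS, 128)) * 0.1
Wl = rng.standard_normal((LANDMARKS, 128)) * 0.1
Wv = rng.standard_normal((VISEMES, 128)) * 0.1

audio = rng.standard_normal((T, AUDIO))  # stand-in audio feature frames
groups = np.exp(stage1.forward(audio) @ Wp.T)
groups /= groups.sum(axis=1, keepdims=True)      # per-frame group softmax
landmarks = stage2.forward(audio) @ Wl.T         # per-frame landmark motion
combined = np.concatenate([groups, landmarks], axis=1)
curves = sigmoid(stage3.forward(combined) @ Wv.T)  # per-frame curve values in [0, 1]
print(curves.shape)
```

The point of the structure, as the abstract frames it, is that the final stage emits sparse, animator-editable motion curves for a production rig rather than raw vertex positions, which is why a per-curve activation in [0, 1] (rather than a mesh) is the natural output here.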
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

📜 Similar Papers

In the same crypt — Graphics

R.I.P. 👻 Ghosted

Everybody Dance Now

Caroline Chan, Shiry Ginosar, ... (+2 more)

cs.GR πŸ› ICCV πŸ“š 820 cites 7 years ago
R.I.P. 👻 Ghosted

Animating Human Athletics

Jessica K. Hodgins, Wayne L. Wooten, ... (+2 more)

cs.GR πŸ› SIGGRAPH πŸ“š 765 cites 3 years ago

Died the same way — 👻 Ghosted