⚰️ Sound

R.I.P. 👻 Ghosted

Attention-based End-to-End Models for Small-Footprint Keyword Spotting

Changhao Shan, Junbo Zhang, ... (+2 more)

cs.SD 🏛 Interspeech 📚 112 cites 8 years ago

R.I.P. 👻 Ghosted

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Xiong Wang, Yangze Li, ... (+6 more)

cs.SD 🏛 ICML 📚 112 cites 1 year ago

R.I.P. 👻 Ghosted

Indoor Sound Source Localization with Probabilistic Neural Network

Yingxiang Sun, Jiajia Chen, ... (+2 more)

cs.SD 🏛 IEEE Transactions on Industrial Electronics 📚 110 cites 8 years ago

R.I.P. 👻 Ghosted

The Jazz Transformer on the Front Line: Exploring the Shortcomings of AI-composed Music through Quantitative Measures

Shih-Lun Wu, Yi-Hsuan Yang

cs.SD 🏛 ISMIR 📚 108 cites 5 years ago

R.I.P. 👻 Ghosted

The Deterministic plus Stochastic Model of the Residual Signal and its Applications

Thomas Drugman, Thierry Dutoit

cs.SD 🏛 IEEE TASLP 📚 107 cites 6 years ago

R.I.P. 👻 Ghosted

TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer

Sicong Huang, Qiyang Li, ... (+4 more)

cs.SD 🏛 ICLR 📚 107 cites 7 years ago

R.I.P. 👻 Ghosted

A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI

Chenshuang Zhang, Chaoning Zhang, ... (+5 more)

cs.SD 🏛 arXiv 📚 107 cites 3 years ago

R.I.P. 👻 Ghosted

Convolutional Recurrent Neural Networks for Bird Audio Detection

Emre Çakır, Sharath Adavanne, ... (+3 more)

cs.SD 🏛 EUSIPCO 📚 105 cites 9 years ago

R.I.P. 👻 Ghosted

Convolutional Gated Recurrent Neural Network Incorporating Spatial Features for Audio Tagging

Yong Xu, Qiuqiang Kong, ... (+3 more)

cs.SD 🏛 IJCNN 📚 105 cites 9 years ago

R.I.P. 👻 Ghosted

LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT

Zhihao Du, Jiaming Wang, ... (+13 more)

cs.SD 🏛 arXiv 📚 104 cites 2 years ago

R.I.P. 👻 Ghosted

Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation

Zhong-Qiu Wang, Peidong Wang, DeLiang Wang

cs.SD 🏛 IEEE/ACM TASLP 📚 104 cites 5 years ago

R.I.P. 👻 Ghosted

AI Song Contest: Human-AI Co-Creation in Songwriting

Cheng-Zhi Anna Huang, Hendrik Vincent Koops, ... (+3 more)

cs.SD 🏛 ISMIR 📚 104 cites 5 years ago

R.I.P. 👻 Ghosted

MMM : Exploring Conditional Multi-Track Music Generation with the Transformer

Jeff Ens, Philippe Pasquier

cs.SD 🏛 arXiv 📚 104 cites 5 years ago

R.I.P. 👻 Ghosted

Variational Autoencoders for Learning Latent Representations of Speech Emotion: A Preliminary Study

Siddique Latif, Rajib Rana, ... (+2 more)

cs.SD 🏛 Interspeech 📚 104 cites 8 years ago

R.I.P. 👻 Ghosted

Virufy: Global Applicability of Crowdsourced and Clinical Datasets for AI Detection of COVID-19 from Cough

Gunvant Chaudhari, Xinyi Jiang, ... (+5 more)

cs.SD 🏛 arXiv 📚 102 cites 5 years ago

R.I.P. 👻 Ghosted

Deep Karaoke: Extracting Vocals from Musical Mixtures Using a Convolutional Deep Neural Network

Andrew J. R. Simpson, Gerard Roma, Mark D. Plumbley

cs.SD 🏛 Latent Variable Analysis and Signal Separation 📚 102 cites 11 years ago

R.I.P. 👻 Ghosted

A Deterministic plus Stochastic Model of the Residual Signal for Improved Parametric Speech Synthesis

Thomas Drugman, Geoffrey Wilfart, Thierry Dutoit

cs.SD 🏛 Interspeech 📚 101 cites 6 years ago

R.I.P. 👻 Ghosted

Recognizing Multi-talker Speech with Permutation Invariant Training

Dong Yu, Xuankai Chang, Yanmin Qian

cs.SD 🏛 Interspeech 📚 101 cites 9 years ago

R.I.P. 👻 Ghosted

Encoding Musical Style with Transformer Autoencoders

Kristy Choi, Curtis Hawthorne, ... (+3 more)

cs.SD 🏛 ICML 📚 100 cites 6 years ago

R.I.P. 👻 Ghosted

Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective

Zhong-Qiu Wang, Ke Tan, DeLiang Wang

cs.SD 🏛 ICASSP 📚 99 cites 7 years ago

R.I.P. 👻 Ghosted

A Neural Parametric Singing Synthesizer

Merlijn Blaauw, Jordi Bonada

cs.SD 🏛 Interspeech 📚 99 cites 9 years ago

🌅 💤 Eternal Rest

Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation

Yuan Gan, Zongxin Yang, ... (+3 more)

cs.SD 🏛 ICCV 📚 98 cites 2 years ago

R.I.P. 👻 Ghosted

Sample-level CNN Architectures for Music Auto-tagging Using Raw Waveforms

Taejun Kim, Jongpil Lee, Juhan Nam

cs.SD 🏛 ICASSP 📚 98 cites 8 years ago

R.I.P. 👻 Ghosted

Robust sound event detection in bioacoustic sensor networks

Vincent Lostanlen, Justin Salamon, ... (+3 more)

cs.SD 🏛 PLoS ONE 📚 97 cites 6 years ago

🏛️ The Sound Crypt