⚰️ Sound

R.I.P. 👻 Ghosted

Language and Noise Transfer in Speech Enhancement Generative Adversarial Network

Santiago Pascual, Maruchan Park, ... (+3 more)

cs.SD 🏛 ICASSP 📚 28 cites 8 years ago

R.I.P. 👻 Ghosted

BigEAR: Inferring the Ambient and Emotional Correlates from Smartphone-based Acoustic Big Data

Harishchandra Dubey, Matthias R. Mehl, Kunal Mankodiya

cs.SD 🏛 IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies 📚 28 cites 9 years ago

R.I.P. 👻 Ghosted

CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models

Hao-Wen Dong, Xiaoyu Liu, ... (+6 more)

cs.SD 🏛 WASPAA 📚 27 cites 2 years ago

R.I.P. 👻 Ghosted

AdaDurIAN: Few-shot Adaptation for Neural Text-to-Speech with DurIAN

Zewang Zhang, Qiao Tian, ... (+3 more)

cs.SD 🏛 arXiv 📚 27 cites 5 years ago

R.I.P. 👻 Ghosted

The Bach Doodle: Approachable music composition with machine learning at scale

Cheng-Zhi Anna Huang, Curtis Hawthorne, ... (+5 more)

cs.SD 🏛 ISMIR 📚 27 cites 6 years ago

R.I.P. 👻 Ghosted

MuSE-ing on the Impact of Utterance Ordering On Crowdsourced Emotion Annotations

Mimansa Jaiswal, Zakaria Aldeneh, ... (+5 more)

cs.SD 🏛 ICASSP 📚 27 cites 7 years ago

R.I.P. 👻 Ghosted

Improved Chord Recognition by Combining Duration and Harmonic Language Models

Filip Korzeniowski, Gerhard Widmer

cs.SD 🏛 ISMIR 📚 27 cites 7 years ago

R.I.P. 👻 Ghosted

Generating music with sentiment using Transformer-GANs

Pedro Neves, Jose Fornari, João Florindo

cs.SD 🏛 ISMIR 📚 27 cites 3 years ago

R.I.P. 👻 Ghosted

Unified Mandarin TTS Front-end Based on Distilled BERT Model

Yang Zhang, Liqun Deng, Yasheng Wang

cs.SD 🏛 arXiv 📚 26 cites 5 years ago

R.I.P. 👻 Ghosted

Towards Robust Neural Vocoding for Speech Generation: A Survey

Po-chun Hsu, Chun-hsuan Wang, ... (+2 more)

cs.SD 🏛 arXiv 📚 26 cites 6 years ago

R.I.P. 👻 Ghosted

STC Speaker Recognition Systems for the VOiCES From a Distance Challenge

Sergey Novoselov, Aleksei Gusev, ... (+6 more)

cs.SD 🏛 Interspeech 📚 26 cites 7 years ago

R.I.P. 👻 Ghosted

A Hybrid Approach with Multi-channel I-Vectors and Convolutional Neural Networks for Acoustic Scene Classification

Hamid Eghbal-zadeh, Bernhard Lehner, ... (+2 more)

cs.SD 🏛 EUSIPCO 📚 26 cites 8 years ago

R.I.P. 👻 Ghosted

Probabilistic Binary-Mask Cocktail-Party Source Separation in a Convolutional Deep Neural Network

Andrew J. R. Simpson

cs.SD 🏛 arXiv 📚 26 cites 11 years ago

R.I.P. 👻 Ghosted

ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models

Pengfei Zhu, Chao Pang, ... (+6 more)

cs.SD 🏛 arXiv 📚 26 cites 3 years ago

R.I.P. 👻 Ghosted

Fall Detection from Audios with Audio Transformers

Prabhjot Kaur, Qifan Wang, Weisong Shi

cs.SD 🏛 Smart Health 📚 26 cites 3 years ago

R.I.P. 👻 Ghosted

A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition

Shentong Mo, Pedro Morgado

cs.SD 🏛 ICML 📚 25 cites 2 years ago

R.I.P. 👻 Ghosted

Between Homomorphic Signal Processing and Deep Neural Networks: Constructing Deep Algorithms for Polyphonic Music Transcription

Li Su

cs.SD 🏛 APSIPA 📚 25 cites 8 years ago

R.I.P. 👻 Ghosted

Histogram of gradients of Time-Frequency Representations for Audio scene detection

Alain Rakotomamonjy, Gilles Gasso

cs.SD 🏛 arXiv 📚 25 cites 10 years ago

R.I.P. 👻 Ghosted

Source localization and denoising: a perspective from the TDOA space

Marco Compagnoni, Antonio Canclini, ... (+4 more)

cs.SD 🏛 Multidimensional Systems and Signal Processing 📚 25 cites 10 years ago

R.I.P. 👻 Ghosted

Multi-source Domain Adaptation for Text-independent Forensic Speaker Recognition

Zhenyu Wang, John H. L. Hansen

cs.SD 🏛 IEEE/ACM TASLP 📚 25 cites 3 years ago

R.I.P. 👻 Ghosted

ParaTTS: Learning Linguistic and Prosodic Cross-sentence Information in Paragraph-based TTS

Liumeng Xue, Frank K. Soong, ... (+2 more)

cs.SD 🏛 IEEE/ACM TASLP 📚 25 cites 3 years ago

R.I.P. 👻 Ghosted

Multimodal Fish Feeding Intensity Assessment in Aquaculture

Meng Cui, Xubo Liu, ... (+6 more)

cs.SD 🏛 IEEE TASE 📚 24 cites 2 years ago

R.I.P. 👻 Ghosted

V2Meow: Meowing to the Visual Beat via Video-to-Music Generation

Kun Su, Judith Yue Li, ... (+9 more)

cs.SD 🏛 AAAI 📚 24 cites 2 years ago

R.I.P. 👻 Ghosted

Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data

Thibault Doutre, Wei Han, ... (+8 more)

cs.SD 🏛 ICASSP 📚 24 cites 5 years ago

🏛️ The Sound Crypt