SoundCam: A Dataset for Finding Humans Using Room Acoustics
November 06, 2023 ยท Entered Twilight ยท ๐ Neural Information Processing Systems
"No code URL or promise found in abstract"
"Code repo scraped from project page (backfill)"
Evidence collected by the PWNC Scanner
Repo contents: README.md, SoundCamConsentForm.pdf, VGGish, datasets, download.sh, examples.ipynb, indices, labels, posekernel
Authors
Mason Wang, Samuel Clarke, Jui-Hsien Wang, Ruohan Gao, Jiajun Wu
arXiv ID
2311.03517
Category
cs.SD: Sound
Cross-listed
cs.CV,
eess.AS
Citations
11
Venue
Neural Information Processing Systems
Repository
https://github.com/maswang32/soundcam
โญ 3
Last Checked
1 month ago
Abstract
A room's acoustic properties are a product of the room's geometry, the objects within the room, and their specific positions. A room's acoustic properties can be characterized by its impulse response (RIR) between a source and listener location, or roughly inferred from recordings of natural signals present in the room. Variations in the positions of objects in a room can effect measurable changes in the room's acoustic properties, as characterized by the RIR. Existing datasets of RIRs either do not systematically vary positions of objects in an environment, or they consist of only simulated RIRs. We present SoundCam, the largest dataset of unique RIRs from in-the-wild rooms publicly released to date. It includes 5,000 10-channel real-world measurements of room impulse responses and 2,000 10-channel recordings of music in three different rooms, including a controlled acoustic lab, an in-the-wild living room, and a conference room, with different humans in positions throughout each room. We show that these measurements can be used for interesting tasks, such as detecting and identifying humans, and tracking their positions.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Sound
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
CNN Architectures for Large-Scale Audio Classification
R.I.P.
๐ป
Ghosted
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification
R.I.P.
๐ป
Ghosted
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R.I.P.
๐ป
Ghosted