Classification of Infant Sleep/Wake States: Cross-Attention among Large Scale Pretrained Transformer Networks using Audio, ECG, and IMU Data

June 27, 2023 · Declared Dead · 🏛 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Kai Chieh Chang, Mark Hasegawa-Johnson, Nancy L. McElwain, Bashima Islam
arXiv ID: 2306.15808
Category: cs.MM (Multimedia)
Cross-listed: cs.SD, eess.AS, eess.SP
Citations: 7
Venue: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
Last Checked: 3 months ago

Abstract

Infant sleep is critical to brain and behavioral development. Prior studies on infant sleep/wake classification have largely relied on expensive and burdensome polysomnography (PSG) tests in the laboratory or on wearable devices that collect single-modality data. To ease data collection and improve detection accuracy, we aimed to advance this field of study by using a multi-modal wearable device, LittleBeats (LB), to collect audio, electrocardiogram (ECG), and inertial measurement unit (IMU) data among a cohort of 28 infants. We employed a 3-branch (audio/ECG/IMU) large-scale transformer-based neural network (NN) to demonstrate the potential of such multi-modal data. We pretrained each branch independently with its respective modality, then finetuned the model by fusing the pretrained transformer layers with cross-attention. We show that multi-modal data significantly improves sleep/wake classification (accuracy = 0.880), compared with using a single modality (accuracy = 0.732). Our approach to multi-modal mid-level fusion may be adaptable to a diverse range of architectures and tasks, expanding future directions of infant behavioral research.
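
The abstract gives no implementation details for the cross-attention fusion, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes each branch has already produced frame embeddings of equal width, uses plain scaled dot-product attention with no learned projections, and lets each branch attend to the frames of the other two before mean pooling. The names `cross_attention`, `mean_pool`, `fuse_branches`, and the toy embeddings are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention with queries from one modality and
    keys/values from another. Each argument is a list of float vectors."""
    scale = math.sqrt(len(keys[0]))
    width = len(values[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(width)]
        )
    return attended


def mean_pool(frames):
    """Average a list of frame vectors into a single vector."""
    count = len(frames)
    return [sum(f[d] for f in frames) / count for d in range(len(frames[0]))]


def fuse_branches(branches):
    """Assumed fusion scheme: every branch attends to the frames of all other
    branches, and the pooled results are concatenated into one feature vector."""
    fused = []
    for name, frames in branches.items():
        context = [f for other, fs in branches.items() if other != name for f in fs]
        fused.extend(mean_pool(cross_attention(frames, context, context)))
    return fused


# Toy embeddings standing in for the outputs of pretrained branches (made up).
branches = {
    "audio": [[1.0, 0.0], [0.5, 0.5]],
    "ecg": [[0.0, 1.0]],
    "imu": [[1.0, 1.0], [0.0, 0.0], [0.2, 0.8]],
}
features = fuse_branches(branches)
```

In a real system the fused feature vector would feed a sleep/wake classification head, and the queries, keys, and values would pass through learned projections inside the pretrained transformer layers; both are omitted here to keep the sketch dependency-free.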