Speak or Stay Silent: Context-Aware Turn-Taking in Multi-Party Dialogue

March 12, 2026 Β· Grace Period Β· πŸ› Interspeech 2026

⏳ Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors Kratika Bhagtani, Mrinal Anand, Yu Chen Xu, Amit Kumar Singh Yadav arXiv ID 2603.11409 Category cs.AI: Artificial Intelligence Cross-listed cs.CL Citations 0 Venue Interspeech 2026
Abstract
Existing voice AI assistants treat every detected pause as an invitation to speak. This works in dyadic dialogue, but in multi-party settings, where an AI assistant participates alongside multiple speakers, pauses are abundant and ambiguous. An assistant that speaks on every pause becomes disruptive rather than useful. In this work, we formulate context-aware turn-taking: at every detected pause, given the full conversation context, our method decides whether the assistant should speak or stay silent. We introduce a benchmark of over 120K labeled conversations spanning three multi-party corpora. Evaluating eight recent large language models, we find that they consistently fail at context-aware turn-taking under zero-shot prompting. We then propose a supervised fine-tuning approach with reasoning traces, improving balanced accuracy by up to 23 percentage points. Our findings suggest that context-aware turn-taking is not an emergent capability; it must be explicitly trained.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Artificial Intelligence