Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions

October 17, 2024 · Declared Dead · 🏛 International Conference on Learning Representations

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Michael J. Q. Zhang, W. Bradley Knox, Eunsol Choi
arXiv ID: 2410.13788
Category: cs.CL: Computation & Language
Citations: 40
Venue: International Conference on Learning Representations
Last Checked: 4 months ago

Abstract

Large language models (LLMs) must often respond to highly ambiguous user requests. In such cases, the LLM's best response may be to ask a clarifying question to elicit more information. Existing LLMs often respond by presupposing a single interpretation of such ambiguous requests, frustrating users who intended a different interpretation. We speculate this is caused by current preference data labeling practice, where LLM responses are evaluated only on their prior contexts. To address this, we assign preference labels by simulating their expected outcomes in future turns. This allows LLMs to learn to ask clarifying questions when they can generate responses that are tailored to each user interpretation in future turns. On open-domain QA datasets with multiple annotations, we evaluate systems based on their ability to ask clarifying questions to recover each user's interpretation and expected answer. We compare systems trained using our proposed preference labeling methods against standard methods, which assign preferences based on only prior context. Our method achieves a 5% improvement in F1 measured against the answer set from different interpretations of each query, showing the value of modeling future conversation turns. We further demonstrate that our method can be used to train models to judiciously determine when to ask clarifying questions, directly answering the question when clarification is unnecessary. In our experiments, we find that our method achieves a 3% improvement in the accuracy of such judgments over existing methods.
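The labeling idea in the abstract, scoring a response by its simulated outcome in future turns instead of by the prior context alone, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the token-level F1 scorer, and the toy `toy_continue` simulator (which stands in for a simulated user plus a follow-up model turn) are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Each interpretation pairs a simulated user's intent with that user's gold answer.
Interpretation = Tuple[str, str]


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two strings (an assumed stand-in metric)."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def expected_future_score(
    response: str,
    interpretations: List[Interpretation],
    continue_fn: Callable[[str, str], str],
) -> float:
    """Average final-answer F1 over every simulated user interpretation."""
    scores = []
    for intent, gold in interpretations:
        # Roll the conversation forward for this particular user.
        final_answer = continue_fn(response, intent)
        scores.append(token_f1(final_answer, gold))
    return sum(scores) / len(scores)


def label_preference(
    resp_a: str,
    resp_b: str,
    interpretations: List[Interpretation],
    continue_fn: Callable[[str, str], str],
) -> str:
    """Prefer the response whose simulated future turns score higher."""
    a = expected_future_score(resp_a, interpretations, continue_fn)
    b = expected_future_score(resp_b, interpretations, continue_fn)
    if a == b:
        return "tie"
    return "a" if a > b else "b"


# Hypothetical toy data: one ambiguous query with two readings.
TOY_ANSWERS = {"the film": "2001", "the book": "1997"}
TOY_INTERPRETATIONS = list(TOY_ANSWERS.items())


def toy_continue(response: str, intent: str) -> str:
    """Toy simulator: a question gets clarified and then answered correctly;
    a direct answer is final and is returned unchanged."""
    if response.strip().endswith("?"):
        return TOY_ANSWERS[intent]
    return response
```

With the toy data, a direct answer of "1997" satisfies only one of the two simulated users, while the clarifying question lets the follow-up turn satisfy both, so `label_preference("1997", "Do you mean the book or the film?", TOY_INTERPRETATIONS, toy_continue)` prefers the question. In a real setup the simulator would presumably be replaced by model-generated user replies and model answers, and ties would need a deliberate policy.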