AutoElicit: Using Large Language Models for Expert Prior Elicitation in Predictive Modelling

November 26, 2024 · Declared Dead · 🏛 International Conference on Machine Learning

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Alexander Capstick, Rahul G. Krishnan, Payam Barnaghi
arXiv ID: 2411.17284
Category: cs.LG (Machine Learning)
Cross-listed: cs.CL, stat.ML
Citations: 7
Venue: International Conference on Machine Learning
Last Checked: 4 months ago

Abstract

Large language models (LLMs) acquire a breadth of information across various domains. However, their computational complexity, cost, and lack of transparency often hinder their direct application for predictive tasks where privacy and interpretability are paramount. In fields such as healthcare, biology, and finance, specialised and interpretable linear models still hold considerable value. In such domains, labelled data may be scarce or expensive to obtain. Well-specified prior distributions over model parameters can reduce the sample complexity of learning through Bayesian inference; however, eliciting expert priors can be time-consuming. We therefore introduce AutoElicit to extract knowledge from LLMs and construct priors for predictive models. We show these priors are informative and can be refined using natural language. We perform a careful study contrasting AutoElicit with in-context learning and demonstrate how to perform model selection between the two methods. We find that AutoElicit yields priors that can substantially reduce error over uninformative priors, using fewer labels, and consistently outperform in-context learning. We show that AutoElicit saves over 6 months of labelling effort when building a new predictive model for urinary tract infections from sensor recordings of people living with dementia.
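The abstract's claim that a well-specified prior reduces the number of labels needed can be illustrated with a small toy sketch. This is not the AutoElicit pipeline: no LLM is queried, and the prior mean below is a hand-written stand-in for an elicited prior. The true weights, noise level, prior variances, and the function names `posterior_mean` and `average_errors` are all assumptions made for illustration. The sketch uses conjugate Bayesian linear regression with a Gaussian prior and known noise variance, and compares the weight error under an informative prior against a near-flat prior when there are fewer labels than features.

```python
import numpy as np


def posterior_mean(X, y, prior_mean, prior_var, noise_var):
    """Posterior mean of the weights in Bayesian linear regression.

    Assumes an independent Gaussian prior N(prior_mean, prior_var * I)
    and Gaussian observation noise with known variance noise_var.
    """
    d = X.shape[1]
    prior_precision = np.eye(d) / prior_var
    # Posterior precision and the precision-weighted mean (conjugate update).
    A = prior_precision + X.T @ X / noise_var
    b = prior_precision @ prior_mean + X.T @ y / noise_var
    return np.linalg.solve(A, b)


def average_errors(n_labels=3, n_trials=200, seed=0):
    """Average weight error for an informative prior versus a near-flat prior.

    All numbers here are hypothetical toy values, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    true_w = np.array([2.0, -1.0, 0.5, 0.0, 1.5])
    # Stand-in for an elicited prior: roughly right, but not exact.
    informative_mean = np.array([1.8, -0.8, 0.4, 0.1, 1.3])
    noise_sd = 0.5
    errors = np.zeros(2)
    for _ in range(n_trials):
        X = rng.normal(size=(n_labels, true_w.size))
        y = X @ true_w + rng.normal(scale=noise_sd, size=n_labels)
        w_informative = posterior_mean(X, y, informative_mean, 0.25, noise_sd**2)
        w_flat = posterior_mean(X, y, np.zeros_like(true_w), 100.0, noise_sd**2)
        errors += np.array([
            np.linalg.norm(w_informative - true_w),
            np.linalg.norm(w_flat - true_w),
        ])
    return errors / n_trials


if __name__ == "__main__":
    err_informative, err_flat = average_errors()
    print(f"informative prior: {err_informative:.3f}")
    print(f"near-flat prior:   {err_flat:.3f}")
```

With only three labels for five features, the near-flat prior cannot pin down the weights in directions the data do not cover, while the informative prior fills in those directions. Increasing `n_labels` in this toy setting should narrow the gap between the two, which matches the abstract's framing that informative priors matter most when labels are scarce.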