Prompting Datasets: Data Discovery with Conversational Agents
December 15, 2023 Β· Declared Dead Β· π arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Johanna Walker, Elisavet Koutsiana, Joe Massey, Gefion Thuermer, Elena Simperl
arXiv ID
2312.09947
Category
cs.HC: Human-Computer Interaction
Citations
5
Venue
arXiv.org
Last Checked
4 months ago
Abstract
Can large language models assist in data discovery? Data discovery predominantly happens via search on a data portal or the web, followed by assessment of the dataset to ensure it is fit for the intended purpose. The ability of conversational generative AI (CGAI) to support recommendations with reasoning implies it can suggest datasets to users, explain why it has done so, and provide information akin to documentation regarding the dataset in order to support a use decision. We hold 3 workshops with data users and find that, despite limitations around web capabilities, CGAIs are able to suggest relevant datasets and provide many of the required sensemaking activities, as well as support dataset analysis and manipulation. However, CGAIs may also suggest fictional datasets, and perform inaccurate analysis. We identify emerging practices in data discovery and present a model of these to inform future research directions and data prompt design.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Human-Computer Interaction
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Improving fairness in machine learning systems: What do industry practitioners need?
R.I.P.
π»
Ghosted
Identifying Stable Patterns over Time for Emotion Recognition from EEG
R.I.P.
π»
Ghosted
Questioning the AI: Informing Design Practices for Explainable AI User Experiences
R.I.P.
π»
Ghosted
Deep Learning for Sensor-based Human Activity Recognition: Overview, Challenges and Opportunities
R.I.P.
π»
Ghosted
Educational data mining and learning analytics: An updated survey
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted