Rethinking Dataset Discovery with DataScout

July 25, 2025 · Declared Dead · 🏛 ACM Symposium on User Interface Software and Technology

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Rachel Lin, Bhavya Chopra, Wenjing Lin, Shreya Shankar, Madelon Hulsebos, Aditya G. Parameswaran
arXiv ID: 2507.18971
Category: cs.HC (Human-Computer Interaction)
Citations: 0
Venue: ACM Symposium on User Interface Software and Technology
Last Checked: 4 months ago

Abstract
Dataset search -- the process of finding appropriate datasets for a given task -- remains a critical yet under-explored challenge in data science workflows. Assessing dataset suitability for a task (e.g., training a classification model) is a multi-pronged affair that involves understanding: data characteristics (e.g., granularity, attributes, size), semantics (e.g., data semantics, creation goals), and relevance to the task at hand. Present-day dataset search interfaces are restrictive -- users struggle to convey implicit preferences and lack visibility into the search space and result inclusion criteria -- making query iteration challenging. To bridge these gaps, we introduce DataScout to proactively steer users through the process of dataset discovery via (i) AI-assisted query reformulations informed by the underlying search space, (ii) semantic search and filtering based on dataset content, including attributes (columns) and granularity (rows), and (iii) dataset relevance indicators, generated dynamically based on the user-specified task. A within-subjects study with 12 participants comparing DataScout to keyword and semantic dataset search reveals that users uniquely employ DataScout's features not only for structured explorations, but also to glean feedback on their search queries and build conceptual models of the search space.
