Explaining Wrong Queries Using Small Examples
April 09, 2019 ยท Declared Dead ยท ๐ SIGMOD Conference
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Zhengjie Miao, Sudeepa Roy, Jun Yang
arXiv ID
1904.04467
Category
cs.DB: Databases
Citations
41
Venue
SIGMOD Conference
Last Checked
2 months ago
Abstract
For testing the correctness of SQL queries, e.g., evaluating student submissions in a database course, a standard practice is to execute the query in question on some test database instance and compare its result with that of the correct query. Given two queries $Q_1$ and $Q_2$, we say that a database instance $D$ is a counterexample (for $Q_1$ and $Q_2$) if $Q_1(D)$ differs from $Q_2(D)$; such a counterexample can serve as an explanation of why $Q_1$ and $Q_2$ are not equivalent. While the test database instance may serve as a counterexample, it may be too large or complex to read and understand where the inequivalence comes from. Therefore, in this paper, given a known counterexample $D$ for $Q_1$ and $Q_2$, we aim to find the smallest counterexample $D' \subseteq D$ where $Q_1(D') \neq Q_2(D')$. The problem in general is NP-hard. We give a suite of algorithms for finding the smallest counterexample for different classes of queries, some more tractable than others. We also present an efficient provenance-based algorithm for SPJUD queries that uses a constraint solver, and extend it to more complex queries with aggregation, group-by, and nested queries. We perform extensive experiments indicating the effectiveness and scalability of our solution on student queries from an undergraduate database course and on queries from the TPC-H benchmark. We also report a user study from the course where we deployed our tool to help students with an assignment on relational algebra.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Databases
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
The Case for Learned Index Structures
R.I.P.
๐ป
Ghosted
Untangling Blockchain: A Data Processing View of Blockchain Systems
R.I.P.
๐ป
Ghosted
Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades
R.I.P.
๐ป
Ghosted
BLOCKBENCH: A Framework for Analyzing Private Blockchains
R.I.P.
๐ป
Ghosted
Data Synthesis based on Generative Adversarial Networks
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted