Finding Favourite Tuples on Data Streams with Provably Few Comparisons

July 06, 2023 Β· Declared Dead Β· πŸ› Knowledge Discovery and Data Mining

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Guangyi Zhang, Nikolaj Tatti, Aristides Gionis arXiv ID 2307.02946 Category cs.DB: Databases Cross-listed cs.DS Citations 3 Venue Knowledge Discovery and Data Mining Last Checked 4 months ago
Abstract
One of the most fundamental tasks in data science is to assist a user with unknown preferences in finding high-utility tuples within a large database. To accurately elicit the unknown user preferences, a widely-adopted way is by asking the user to compare pairs of tuples. In this paper, we study the problem of identifying one or more high-utility tuples by adaptively receiving user input on a minimum number of pairwise comparisons. We devise a single-pass streaming algorithm, which processes each tuple in the stream at most once, while ensuring that the memory size and the number of requested comparisons are in the worst case logarithmic in $n$, where $n$ is the number of all tuples. An important variant of the problem, which can help to reduce human error in comparisons, is to allow users to declare ties when confronted with pairs of tuples of nearly equal utility. We show that the theoretical guarantees of our method can be maintained for this important problem variant. In addition, we show how to enhance existing pruning techniques in the literature by leveraging powerful tools from mathematical programming. Finally, we systematically evaluate all proposed algorithms over both synthetic and real-life datasets, examine their scalability, and demonstrate their superior performance over existing methods.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Databases

Died the same way β€” πŸ‘» Ghosted