Fast Prefix Search in Little Space, with Applications
April 12, 2018 Β· Declared Dead Β· π Embedded Systems and Applications
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Djamal Belazzougui, Paolo Boldi, Rasmus Pagh, Sebastiano Vigna
arXiv ID
1804.04720
Category
cs.DS: Data Structures & Algorithms
Citations
59
Venue
Embedded Systems and Applications
Last Checked
3 months ago
Abstract
It has been shown in the indexing literature that there is an essential difference between prefix/range searches on the one hand, and predecessor/rank searches on the other hand, in that the former provably allows faster query resolution. Traditionally, prefix search is solved by data structures that are also dictionaries---they actually contain the strings in $S$. For very large collections stored in slow-access memory, we propose much more compact data structures that support \emph{weak} prefix searches---they return the ranks of matching strings provided that \emph{some} string in $S$ starts with the given prefix. In fact, we show that our most space-efficient data structure is asymptotically space-optimal. Previously, data structures such as String B-trees (and more complicated cache-oblivious string data structures) have implicitly supported weak prefix queries, but they all have query time that grows logarithmically with the size of the string collection. In contrast, our data structures are simple, naturally cache-efficient, and have query time that depends only on the length of the prefix, all the way down to constant query time for strings that fit in one machine word. We give several applications of weak prefix searches, including exact prefix counting and approximate counting of tuples matching conjunctive prefix conditions.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Data Structures & Algorithms
π
π
The Cartographer
R.I.P.
π»
Ghosted
Route Planning in Transportation Networks
R.I.P.
π»
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
π»
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
π»
Ghosted
Graph Isomorphism in Quasipolynomial Time
π
π
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted