Scalability and Total Recall with Fast CoveringLSH

February 08, 2016 Β· Declared Dead Β· πŸ› International Conference on Information and Knowledge Management

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Ninh Pham, Rasmus Pagh arXiv ID 1602.02620 Category cs.DB: Databases Cross-listed cs.DS, cs.IR Citations 17 Venue International Conference on Information and Knowledge Management Last Checked 3 months ago
Abstract
Locality-sensitive hashing (LSH) has emerged as the dominant algorithmic technique for similarity search with strong performance guarantees in high-dimensional spaces. A drawback of traditional LSH schemes is that they may have \emph{false negatives}, i.e., the recall is less than 100\%. This limits the applicability of LSH in settings requiring precise performance guarantees. Building on the recent theoretical "CoveringLSH" construction that eliminates false negatives, we propose a fast and practical covering LSH scheme for Hamming space called \emph{Fast CoveringLSH (fcLSH)}. Inheriting the design benefits of CoveringLSH our method avoids false negatives and always reports all near neighbors. Compared to CoveringLSH we achieve an asymptotic improvement to the hash function computation time from $\mathcal{O}(dL)$ to $\mathcal{O}(d + L\log{L})$, where $d$ is the dimensionality of data and $L$ is the number of hash tables. Our experiments on synthetic and real-world data sets demonstrate that \emph{fcLSH} is comparable (and often superior) to traditional hashing-based approaches for search radius up to 20 in high-dimensional Hamming space.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Databases

Died the same way β€” πŸ‘» Ghosted