Dimensionality-Reduction Techniques for Approximate Nearest Neighbor Search: A Survey and Evaluation

March 20, 2024 ยท The Cartographer ยท ๐Ÿ› IEEE Data Engineering Bulletin

๐Ÿ“š THE CARTOGRAPHER: The Cartographer
Survey/review paper โ€” maps the landscape rather than implementing a method.

"No code URL or promise found in abstract"
"Title-pattern auto-detect: Dimensionality-Reduction Techniques for Approximate Nearest Neighbor Search: A Survey and Evaluation"

Evidence collected by the PWNC Scanner

Authors Zeyu Wang, Haoran Xiong, Qitong Wang, Zhenying He, Peng Wang, Themis Palpanas, Wei Wang arXiv ID 2403.13491 Category cs.DB: Databases Citations 3 Venue IEEE Data Engineering Bulletin Last Checked 4 days ago
Abstract
Approximate Nearest Neighbor Search (ANNS) on high-dimensional vectors has become a fundamental and essential component in various machine learning tasks. Recently, with the rapid development of deep learning models and the applications of Large Language Models (LLMs), the dimensionality of the vectors keeps growing in order to accommodate a richer semantic representation. This poses a major challenge to the ANNS solutions since distance calculation cost in ANNS grows linearly with the dimensionality of vectors. To overcome this challenge, dimensionality-reduction techniques can be leveraged to accelerate the distance calculation in the search process. In this paper, we investigate six dimensionality-reduction techniques that have the potential to improve ANNS solutions, including classical algorithms such as PCA and vector quantization, as well as algorithms based on deep learning approaches. We further describe two frameworks to apply these techniques in the ANNS workflow, and theoretically analyze the time and space costs, as well as the beneficial threshold for the pruning ratio of these techniques. The surveyed techniques are evaluated on six public datasets. The analysis of the results reveals the characteristics of the different families of techniques and provides insights into the promising future research directions.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Databases