BEV-CV: Birds-Eye-View Transform for Cross-View Geo-Localisation

December 23, 2023 · Declared Dead · 🏛 IEEE/RSJ International Conference on Intelligent Robots and Systems

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Tavis Shore, Simon Hadfield, Oscar Mendez
arXiv ID: 2312.15363
Category: cs.CV (Computer Vision)
Cross-listed: cs.LG
Citations: 6
Venue: IEEE/RSJ International Conference on Intelligent Robots and Systems
Last Checked: 4 months ago

Abstract

Cross-view image matching for geo-localisation is a challenging problem due to the significant visual difference between aerial and ground-level viewpoints. The method provides localisation capabilities from geo-referenced images, eliminating the need for external devices or costly equipment. This enhances the capacity of agents to autonomously determine their position, navigate, and operate effectively in GNSS-denied environments. Current research employs a variety of techniques to reduce the domain gap, such as applying polar transforms to aerial images or synthesising between perspectives. However, these approaches generally rely on having a 360° field of view, limiting real-world feasibility. We propose BEV-CV, an approach introducing two key novelties with a focus on improving the real-world viability of cross-view geo-localisation. Firstly, we bring ground-level images into a semantic Birds-Eye-View before matching embeddings, allowing for direct comparison with aerial image representations. Secondly, we adapt datasets into an application-realistic format: limited Field-of-View images aligned to vehicle direction. BEV-CV achieves state-of-the-art recall accuracies, improving Top-1 rates on 70° crops of CVUSA and CVACT by 23% and 24% respectively. It also decreases computational requirements, reducing floating point operations to below those of previous works and decreasing embedding dimensionality by 33%, which together allow for faster localisation.
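To make two of the abstract's terms concrete, the sketch below shows one plausible way to take a limited Field-of-View crop centred on a vehicle heading from a 360° panorama, and one common way to compute Top-K recall from paired ground and aerial embeddings. It is illustrative only and is not the authors' implementation: the function names (`fov_columns`, `crop_fov`, `top_k_recall`), the assumption that panorama columns map linearly to headings of 0° to 360° starting at column 0, and the use of cosine similarity are all assumptions made here.

```python
import math


def fov_columns(width, heading_deg, fov_deg=70.0):
    """Column indices of a crop spanning fov_deg, centred on heading_deg.

    Assumes (hypothetically) that the panorama's columns cover 0..360
    degrees of heading linearly, with column 0 at heading 0.
    """
    crop_width = round(width * fov_deg / 360.0)
    centre = round(width * (heading_deg % 360.0) / 360.0)
    start = centre - crop_width // 2
    # The modulo wraps the crop across the panorama's left/right seam.
    return [(start + i) % width for i in range(crop_width)]


def crop_fov(panorama, heading_deg, fov_deg=70.0):
    """Crop a panorama given as a list of rows, each a list of pixels."""
    cols = fov_columns(len(panorama[0]), heading_deg, fov_deg)
    return [[row[c] for c in cols] for row in panorama]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_recall(ground, aerial, k=1):
    """Fraction of ground queries whose true aerial match ranks in the top k.

    Assumes ground[i] and aerial[i] are embeddings of the same location.
    """
    hits = 0
    for i, query in enumerate(ground):
        sims = [cosine(query, ref) for ref in aerial]
        # Rank = number of aerial references scoring strictly higher
        # than the true match.
        rank = sum(s > sims[i] for s in sims)
        hits += rank < k
    return hits / len(ground)
```

For example, `fov_columns(360, 0)` yields 70 column indices running from 325 through the seam to 34, and `top_k_recall(ground, aerial, k=1)` returns the share of queries for which the correct aerial image is the single best match.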