Building and Evaluating a Realistic Virtual World for Large Scale Urban Exploration from 360° Videos
October 13, 2025 · Declared Dead · Multimedia tools and applications
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Mizuki Takenawa, Naoki Sugimoto, Leslie Wöhler, Satoshi Ikehata, Kiyoharu Aizawa
arXiv ID
2510.11447
Category
cs.MM: Multimedia
Citations
1
Venue
Multimedia tools and applications
Last Checked
4 months ago
Abstract
We propose to build realistic virtual worlds, called 360RVW, for large urban environments directly from 360° videos. We provide an interface for interactive exploration, where users can freely navigate via their own avatars. 360° videos record the entire environment of the shooting location simultaneously leading to highly realistic and immersive representations. Our system uses 360° videos recorded along streets and builds a 360RVW through four main operations: video segmentation by intersection detection, video completion to remove the videographer, semantic segmentation for virtual collision detection with the avatar, and projection onto a distorted sphere that moves along the camera trajectory following the avatar's movements. Our interface allows users to explore large urban environments by changing their walking direction at intersections or choosing a new location by clicking on a map. Even without a 3D model, the users can experience collision with buildings using metadata produced by semantic segmentation. Furthermore, we stream the 360° videos so users can directly access 360RVW via their web browser. We fully evaluate our system, including a perceptual experiment comparing our approach to previous exploratory interfaces. The results confirm the quality of our system, especially regarding the presence of users and the interactive exploration, making it most suitable for a virtual tour of urban environments.
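The abstract does not say how the segmentation metadata is queried for collisions, so the following is only a minimal sketch of one way it could work, assuming each 360° frame has an equirectangular per-pixel label map. The label ids `WALKABLE` and `BUILDING`, the helper names `direction_to_pixel` and `is_blocked`, and the default downward pitch are all hypothetical and not taken from the paper.

```python
import math

# Hypothetical label ids; the paper's actual class set is not given in the abstract.
WALKABLE = 0   # e.g. road or sidewalk
BUILDING = 1

def direction_to_pixel(yaw, pitch, width, height):
    """Map a viewing direction (radians) to equirectangular (row, col).

    yaw is in [-pi, pi], measured from the image centre;
    pitch is in [-pi/2, pi/2], positive looking up.
    """
    u = (yaw / (2 * math.pi) + 0.5) * width
    v = (0.5 - pitch / math.pi) * height
    col = int(u) % width                      # wrap around horizontally
    row = min(max(int(v), 0), height - 1)     # clamp at the poles
    return row, col

def is_blocked(label_map, yaw, pitch=-math.pi / 4):
    """Return True if the label under the given direction is not walkable.

    label_map is a list of rows of label ids (assumed equirectangular).
    The default pitch looks 45 degrees downward, an arbitrary choice here.
    """
    height = len(label_map)
    width = len(label_map[0])
    row, col = direction_to_pixel(yaw, pitch, width, height)
    return label_map[row][col] != WALKABLE

# Toy 4x8 label map: columns 6 and 7 are buildings, the rest is walkable.
toy_map = [[WALKABLE] * 6 + [BUILDING] * 2 for _ in range(4)]
```

With this toy map, `is_blocked(toy_map, 0.1)` looks roughly straight ahead at a walkable column, while `is_blocked(toy_map, math.pi / 2 + 0.1)` looks toward the building columns. A real system would likely sample several directions around the avatar rather than a single pixel.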
Similar Papers
In the same crypt: Multimedia
Old Age
R.I.P.
Ghosted
Viewport-Adaptive Navigable 360-Degree Video Delivery
The Cartographer
A Comprehensive Survey on Cross-modal Retrieval
The Cartographer
An Overview of Cross-media Retrieval: Concepts, Methodologies, Benchmarks and Challenges
R.I.P.
Ghosted
A Convolutional Neural Network Approach for Post-Processing in HEVC Intra Coding
R.I.P.
Ghosted
Video Generation From Text
Died the same way: Ghosted
R.I.P.
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
Ghosted