| 1 | Reproducing DragDiffusion: Interactive Point-Based Editing with Diffusion Models | Ali Subhan, Ashir Raza | cs.CV | 0 | 2 months ago |
| 2 | REL-SF4PASS: Panoramic Semantic Segmentation with REL Depth Representation and Spherical Fusion | Xuewei Li, Xinghan Bao, ... (+2 more) | cs.CV | 0 | 2 months ago |
| 3 | ReWeaver: Towards Simulation-Ready and Topology-Accurate Garment Reconstruction | Ming Li, Hui Shan, ... (+6 more) | cs.CV | 0 | 2 months ago |
| 4 | GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering | Xincheng Shuai, Ziye Li, ... (+2 more) | cs.CV | 0 | 1 month ago |
| 5 | Seeing Beyond: Extrapolative Domain Adaptive Panoramic Segmentation | Yuanfan Zheng, Kunyu Peng, ... (+2 more) | cs.CV | 0 | 1 month ago |
| 6 | Flash-Unified: A Training-Free and Task-Aware Acceleration Framework for Native Unified Models | Junlong Ke, Zichen Wen, ... (+7 more) | cs.CV | 0 | 1 month ago |
| 7 | Efficient Document Parsing via Parallel Token Prediction | Lei Li, Ze Zhao, ... (+7 more) | cs.CL | 0 | 1 month ago |
| 8 | What Matters for Scalable and Robust Learning in End-to-End Driving Planners? | David Holtz, Niklas Hanselmann, ... (+3 more) | cs.RO | 0 | 1 month ago |
| 9 | ForceVLA2: Unleashing Hybrid Force-Position Control with Force Awareness for Contact-Rich Manipulation | Yang Li, Zhaxizhuoma, ... (+12 more) | cs.RO | 0 | 1 month ago |
| 10 | Question-guided Visual Compression with Memory Feedback for Long-Term Video Understanding | Sosuke Yamao, Natsuki Miyahara, ... (+2 more) | cs.CV | 0 | 1 month ago |
| 11 | GUI-CEval: A Hierarchical and Comprehensive Chinese Benchmark for Mobile GUI Agents | Yang Li, Yuchen Liu, ... (+9 more) | cs.CV | 0 | 1 month ago |
| 12 | Training-free Detection of Generated Videos via Spatial-Temporal Likelihoods | Omer Ben Hayun, Roy Betser, ... (+3 more) | cs.CV | 0 | 1 month ago |
| 13 | $\text{F}^2\text{HDR}$: Two-Stage HDR Video Reconstruction via Flow Adapter and Physical Motion Modeling | Huanjing Yue, Dawei Li, ... (+2 more) | cs.CV | 0 | 1 month ago |
| 14 | SpiralDiff: Spiral Diffusion with LoRA for RGB-to-RAW Conversion Across Cameras | Huanjing Yue, Shangbin Xie, ... (+5 more) | cs.CV | 0 | 1 month ago |
| 15 | LLMind: Bio-inspired Training-free Adaptive Visual Representations for Vision-Language Models | Soumyaratna Debnath, Bui Duc Manh, ... (+2 more) | cs.CV | 0 | 1 month ago |
| 16 | RealVLG-R1: A Large-Scale Real-World Visual-Language Grounding Benchmark for Robotic Perception and Manipulation | Linfei Li, Lin Zhang, Ying Shen | cs.CV | 0 | 1 month ago |
| 17 | RAZOR: Ratio-Aware Layer Editing for Targeted Unlearning in Vision Transformers and Diffusion Models | Ravi Ranjan, Utkarsh Grover, ... (+2 more) | cs.CV | 0 | 1 month ago |
| 18 | Zero-Shot Reconstruction of Animatable 3D Avatars with Cloth Dynamics from a Single Image | Joohyun Kwon, Geonhee Sim, Gyeongsik Moon | cs.CV | 0 | 1 month ago |
| 19 | PHAC: Promptable Human Amodal Completion | Seung Young Noh, Ju Yong Chang | cs.CV | 0 | 1 month ago |
| 20 | Enhancing Hands in 3D Whole-Body Pose Estimation with Conditional Hands Modulator | Gyeongsik Moon | cs.CV | 0 | 1 month ago |
| 21 | E2EGS: Event-to-Edge Gaussian Splatting for Pose-Free 3D Reconstruction | Yunsoo Kim, Changki Sung, ... (+2 more) | cs.CV | 0 | 1 month ago |
| 22 | LUMINA: A Multi-Vendor Mammography Benchmark with Energy Harmonization Protocol | Hongyi Pan, Gorkem Durak, ... (+10 more) | eess.IV | 0 | 1 month ago |
| 23 | RL-ScanIQA: Reinforcement-Learned Scanpaths for Blind 360°Image Quality Assessment | Yujia Wang, Yuyan Li, ... (+4 more) | cs.CV | 0 | 1 month ago |
| 24 | DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization | Ngoc-Son Nguyen, Thanh V. T. Tran, ... (+4 more) | cs.CV | 0 | 1 month ago |
| 25 | Domain-Skewed Federated Learning with Feature Decoupling and Calibration | Huan Wang, Jun Shen, ... (+2 more) | cs.LG | 0 | 1 month ago |
| 26 | UniFusion: A Unified Image Fusion Framework with Robust Representation and Source-Aware Preservation | Xingyuan Li, Songcheng Du, ... (+4 more) | cs.CV | 0 | 1 month ago |
| 27 | BluRef: Unsupervised Image Deblurring with Dense-Matching References | Bang-Dang Pham, Anh Tran, ... (+2 more) | cs.CV | 0 | 1 month ago |
| 28 | Garments2Look: A Multi-Reference Dataset for High-Fidelity Outfit-Level Virtual Try-On with Clothing and Accessories | Junyao Hu, Zhongwei Cheng, ... (+2 more) | cs.CV | 0 | 1 month ago |
| 29 | PhyGaP: Physically-Grounded Gaussians with Polarization Cues | Jiale Wu, Xiaoyang Bai, ... (+3 more) | cs.CV | 0 | 1 month ago |
| 30 | Step-CoT: Stepwise Visual Chain-of-Thought for Medical Visual Question Answering | Lin Fan, Yafei Ou, ... (+9 more) | cs.CV | 0 | 1 month ago |
| 31 | Learning through Creation: A Hash-Free Framework for On-the-Fly Category Discovery | Bohan Zhang, Weidong Tang, ... (+5 more) | cs.CV | 0 | 1 month ago |
| 32 | Computation and Communication Efficient Federated Unlearning via On-server Gradient Conflict Mitigation and Expression | Minh-Duong Nguyen, Senura Hansaja, ... (+5 more) | cs.LG | 0 | 1 month ago |
| 33 | RetimeGS: Continuous-Time Reconstruction of 4D Gaussian Splatting | Xuezhen Wang, Li Ma, ... (+3 more) | cs.CV | 0 | 1 month ago |
| 34 | Ego-1K -- A Large-Scale Multiview Video Dataset for Egocentric Vision | Jae Yong Lee, Daniel Scharstein, ... (+12 more) | cs.CV | 0 | 1 month ago |
| 35 | Every Error has Its Magnitude: Asymmetric Mistake Severity Training for Multiclass Multiple Instance Learning | Sungrae Hong, Jiwon Jeong, ... (+5 more) | cs.CV | 0 | 1 month ago |
| 36 | Learning Generalizable 3D Medical Image Representations from Mask-Guided Self-Supervision | Yunhe Gao, Yabin Zhang, ... (+6 more) | cs.CV | 0 | 1 month ago |
| 37 | Deconstructing the Failure of Ideal Noise Correction: A Three-Pillar Diagnosis | Chen Feng, Zhuo Zhi, ... (+6 more) | cs.LG | 0 | 1 month ago |
| 38 | VIRD: View-Invariant Representation through Dual-Axis Transformation for Cross-View Pose Estimation | Juhye Park, Wooju Lee, ... (+5 more) | cs.CV | 0 | 1 month ago |
| 39 | Stake the Points: Structure-Faithful Instance Unlearning | Kiseong Hong, JungKyoo Shin, Eunwoo Kim | cs.CV | 0 | 1 month ago |
| 40 | Spectral-Geometric Neural Fields for Pose-Free LiDAR View Synthesis | Yinuo Jiang, Jun Cheng, ... (+2 more) | cs.CV | 0 | 1 month ago |
| 41 | Multimodal Protein Language Models for Enzyme Kinetic Parameters: From Substrate Recognition to Conformational Adaptation | Fei Wang, Xinye Zheng, ... (+6 more) | cs.CV | 0 | 1 month ago |
| 42 | coDrawAgents: A Multi-Agent Dialogue Framework for Compositional Image Generation | Chunhan Li, Qifeng Wu, ... (+8 more) | cs.CV | 0 | 1 month ago |
| 43 | SAVA-X: Ego-to-Exo Imitation Error Detection via Scene-Adaptive View Alignment and Bidirectional Cross View Fusion | Xiang Li, Heqian Qiu, ... (+5 more) | cs.CV | 0 | 1 month ago |
| 44 | HIFICL: High-Fidelity In-Context Learning for Multimodal Tasks | Xiaoyu Li, Yuhang Liu, ... (+5 more) | cs.CV | 0 | 1 month ago |
| 45 | HSEmotion Team at ABAW-10 Competition: Facial Expression Recognition, Valence-Arousal Estimation, Action Unit Detection and Fine-Grained Violence Classification | Andrey V. Savchenko, Kseniia Tsypliakova | cs.CV | 0 | 1 month ago |
| 46 | AVION: Aerial Vision-Language Instruction from Offline Teacher to Prompt-Tuned Network | Yu Hu, Jianyang Gu, ... (+5 more) | cs.CV | 0 | 1 month ago |
| 47 | A2Z-10M+: Geometric Deep Learning with A-to-Z BRep Annotations for AI-Assisted CAD Modeling and Reverse Engineering | Pritham Kumar Jena, Bhavika Baburaj, ... (+4 more) | cs.CV | 0 | 1 month ago |
| 48 | Do You See What I Am Pointing At? Gesture-Based Egocentric Video Question Answering | Yura Choi, Roy Miles, ... (+4 more) | cs.CV | 0 | 1 month ago |
| 49 | Revisiting Model Stitching In the Foundation Model Era | Zheda Mai, Ke Zhang, ... (+7 more) | cs.CV | 0 | 1 month ago |
| 50 | SPARROW: Learning Spatial Precision and Temporal Referential Consistency in Pixel-Grounded Video MLLMs | Mohamad Alansari, Naufal Suryanto, ... (+4 more) | cs.CV | 0 | 1 month ago |