Hardware System Overview
Narration included.
Exploded View
Eccentric Force Sensing
Lateral Force Sensing
Light Robustness (50% Cover)
Light Robustness (100% Cover)
Each clip shows one target object in the delicate grasp setting.
Balloon
Chips Bag
Cones
Cookies
Eggs
Grape
Paper
Paper Cup
Pencil
Seaweed
Adding full binocular FingerEye to wrist vision raises mean success from 26.7% to 65.9% in simulation and from 37.5% to 71.3% in the real world.
Full binocular FingerEye outperforms the monocular variant in mean success: 65.9% vs. 59.1% in simulation and 71.3% vs. 56.3% in the real world.
In simulation, post-contact tactile maps stay near wrist-only performance in mean success (24.1% vs. 26.7%), while wrist + FingerEye reaches 65.9%.
GEnc+GDec reaches 65.9% simulation mean success, above GEnc+FDec at 59.8% and NoEnc+FDec, the strongest non-grouped baseline, at 52.0%.
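The gains quoted above are absolute differences in mean success rate, so they read as percentage points rather than relative improvements. The short sketch below recomputes them from the reported numbers. The dictionary keys and the helper name `gain_pp` are illustrative labels chosen here, not names from the project.

```python
# Mean success rates (%) as quoted in the result highlights above.
# Key names are illustrative labels, not identifiers from the project.
mean_success = {
    "simulation": {"wrist_only": 26.7, "monocular_fingereye": 59.1, "binocular_fingereye": 65.9},
    "real_world": {"wrist_only": 37.5, "monocular_fingereye": 56.3, "binocular_fingereye": 71.3},
}


def gain_pp(after, before):
    """Absolute gain in percentage points, rounded to one decimal place."""
    return round(after - before, 1)


gains = {}
for setting, rates in mean_success.items():
    gains[setting] = {
        "binocular_vs_wrist_only": gain_pp(rates["binocular_fingereye"], rates["wrist_only"]),
        "binocular_vs_monocular": gain_pp(rates["binocular_fingereye"], rates["monocular_fingereye"]),
    }

for setting, result in gains.items():
    print(setting, result)
```

Adding binocular FingerEye to wrist vision works out to 39.2 points in simulation and 33.8 points in the real world, and the binocular variant leads the monocular one by 6.8 and 15.0 points respectively.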
Dexterous robotic manipulation requires perception that remains informative from pre-contact approach to contact initiation and post-contact control. We introduce FingerEye, a sensing and learning framework that strengthens robotic dexterity through continuous vision-tactile feedback throughout interaction. On the sensing side, FingerEye integrates binocular RGB cameras with a compliant contact interface to support perception both before and after contact. Before contact, the fingertip cameras provide close-range visual cues and implicit stereo for precise approach and object localization. After contact, marker-tracked deformation of the compliant ring provides a proxy for contact wrench sensing. On the learning side, we build real-and-sim infrastructure for data collection and evaluation, systematically study policy-interface designs for learning with multiple FingerEye sensors, and develop FingerEye Policy, which applies group-structured modality fusion to reduce modality shortcuts and better exploit distributed fingertip feedback. Across seven contact-sensitive task settings, FingerEye improves on the wrist-only policy by over 30 percentage points in mean success rate in both simulation and the real world.
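To give a rough sense of what "marker-tracked deformation" can look like numerically, here is a minimal sketch that summarizes how a set of tracked markers moves between a no-contact reference frame and the current frame. It is not the authors' pipeline: the function name `deformation_summary`, the 2D image-plane marker format, and the choice of mean shift plus mean in-plane rotation as summary features are all assumptions made here. Mapping such features to an actual wrench would need a calibration step that this page does not describe.

```python
import math


def deformation_summary(ref_markers, cur_markers):
    """Summarize marker motion between a reference frame and the current frame.

    Hypothetical helper: both inputs are lists of (x, y) image coordinates,
    with markers in the same order. Returns (mean_dx, mean_dy, mean_rotation_rad).
    The rotation estimate assumes small angles (no wrap-around near +/- pi).
    """
    if not ref_markers or len(ref_markers) != len(cur_markers):
        raise ValueError("marker lists must be non-empty and the same length")

    n = len(ref_markers)
    # Centroids of the marker pattern before and after deformation.
    ref_cx = sum(x for x, _ in ref_markers) / n
    ref_cy = sum(y for _, y in ref_markers) / n
    cur_cx = sum(x for x, _ in cur_markers) / n
    cur_cy = sum(y for _, y in cur_markers) / n

    # Mean shift of the pattern: a crude cue for lateral (shear-like) loading.
    mean_dx = cur_cx - ref_cx
    mean_dy = cur_cy - ref_cy

    # Mean signed angle between each marker's centroid-relative vectors:
    # a crude cue for in-plane twisting.
    angles = []
    for (rx, ry), (qx, qy) in zip(ref_markers, cur_markers):
        ax, ay = rx - ref_cx, ry - ref_cy
        bx, by = qx - cur_cx, qy - cur_cy
        if math.hypot(ax, ay) < 1e-9 or math.hypot(bx, by) < 1e-9:
            continue  # a marker at the centroid carries no angle information
        angles.append(math.atan2(ax * by - ay * bx, ax * bx + ay * by))
    mean_rotation = sum(angles) / len(angles) if angles else 0.0

    return mean_dx, mean_dy, mean_rotation


# Usage: four markers on a ring, shifted sideways without rotation.
ring = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
shifted = [(x + 0.5, y - 0.25) for x, y in ring]
print(deformation_summary(ring, shifted))
```

Separating the pattern's shift from its rotation keeps the two cues independent: a pure sideways push changes only the first two values, while a pure twist changes only the third.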
@misc{xu2026fingereyecontinuousunifiedvisiontactile,
title={FingerEye: Learning Dexterous Manipulation with Continuous Vision-Tactile Sensing},
author={Zhixuan Xu and Yichen Li and Xuanye Wu and Tianyu Qiu and Lin Shao},
year={2026},
eprint={2604.20689},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2604.20689},
}