Highlights:
- Introduces a radar-camera fused Multi-Object Tracking (MOT) system with online calibration.
- Positions radar as a primary sensor rather than a supplementary one in sensor fusion tasks.
- Uses common features between radar and camera for accurate 3D positioning and tracking.
- Reduces manual calibration effort through automated online association techniques.
TLDR:
Researchers Lei Cheng and Siyang Cao have developed an advanced radar-camera fusion framework for Multi-Object Tracking that autonomously calibrates sensors online using common features, substantially improving tracking accuracy and efficiency for intelligent transportation systems.
In a major leap for computer vision and intelligent transportation systems, researchers Lei Cheng and Siyang Cao have unveiled a framework titled ‘Radar-Camera Fused Multi-Object Tracking: Online Calibration and Common Feature’. The study, recently accepted for publication in IEEE Transactions on Intelligent Transportation Systems, rethinks how radar and camera data are fused for more accurate and efficient multi-object tracking (MOT). Unlike traditional fusion models that treat radar as a secondary source, this system integrates radar as a central component for precise range and depth estimation, drastically reducing dependence on manual sensor calibration.
The researchers focus on solving a key challenge in sensor fusion: the alignment of radar and camera detections in real-world conditions where sensor parameters can drift over time. Their method introduces an online calibration process powered by common feature extraction from both sensors. By identifying mutual attributes in radar reflections and visual imagery, the system autonomously calibrates the sensors in real time. This capability enables dynamic adaptability in unpredictable environments, such as traffic intersections or urban streets, where environmental or mechanical shifts can otherwise compromise tracking consistency.
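To make the online calibration idea concrete, here is a minimal Python sketch. It assumes a planar ground-to-image homography refreshed from a sliding buffer of matched radar-camera detections; the names (`OnlineCalibrator`, `estimate_homography`) and the homography model are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def estimate_homography(radar_xy, pixels):
    """Estimate a 3x3 ground-plane-to-image homography with the DLT.

    radar_xy : (N, 2) radar points on the ground plane (meters)
    pixels   : (N, 2) matching image detections (pixels), N >= 4
    """
    A = []
    for (x, y), (u, v) in zip(radar_xy, pixels):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def project(H, radar_xy):
    """Project radar ground-plane points into the image using H."""
    pts = np.hstack([radar_xy, np.ones((len(radar_xy), 1))])
    uvw = pts @ H.T
    return uvw[:, :2] / uvw[:, 2:3]

class OnlineCalibrator:
    """Keep a sliding buffer of common-feature matches and refresh H online."""

    def __init__(self, buffer_size=200, refresh_every=20):
        self.radar, self.pixels = [], []
        self.buffer_size, self.refresh_every = buffer_size, refresh_every
        self.H, self.count = None, 0

    def add_match(self, radar_pt, pixel_pt):
        # Accumulate one radar-camera correspondence from a matched detection.
        self.radar.append(radar_pt)
        self.pixels.append(pixel_pt)
        self.radar = self.radar[-self.buffer_size:]
        self.pixels = self.pixels[-self.buffer_size:]
        self.count += 1
        # Periodically re-estimate the mapping so it tracks sensor drift.
        if self.count % self.refresh_every == 0 and len(self.radar) >= 4:
            self.H = estimate_homography(np.array(self.radar),
                                         np.array(self.pixels))
```

In this sketch, each new common-feature match both feeds the tracker and updates the calibration, which is the spirit of the online, drift-tolerant alignment described above.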
Technically, the proposed framework advances beyond conventional position-based matching. It employs feature matching algorithms and category-consistency verification to ensure that detected entities from radar and camera not only align spatially but also semantically. This multi-level verification enhances association accuracy between different sensor modalities. The implementation includes a radar-camera mapping function embedded within an MOT pipeline, supported by the authors’ open-source codebase available on GitHub. Real-world tests across controlled settings and active traffic scenarios confirmed improvements in tracking precision, robustness, and processing speed. The research lays groundwork for more scalable, autonomous multimodal sensing systems applicable to self-driving cars, robotics, and advanced driver assistance technologies.
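The multi-level association step can also be illustrated with a short, hypothetical example (not the authors' implementation): candidate radar-camera pairs are gated on both projected pixel distance and class agreement, then resolved with the Hungarian algorithm via SciPy. The detection fields `uv`, `center`, and `label` are assumed for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(radar_dets, camera_dets, max_dist=50.0):
    """Match radar detections (projected into the image) to camera detections.

    radar_dets  : list of dicts {'uv': (u, v), 'label': str}
    camera_dets : list of dicts {'center': (u, v), 'label': str}
    Returns a list of (radar_idx, camera_idx) pairs.
    """
    cost = np.full((len(radar_dets), len(camera_dets)), 1e6)
    for i, r in enumerate(radar_dets):
        for j, c in enumerate(camera_dets):
            dist = np.linalg.norm(np.subtract(r['uv'], c['center']))
            # Keep only pairs that agree semantically and lie close spatially.
            if r['label'] == c['label'] and dist < max_dist:
                cost[i, j] = dist
    rows, cols = linear_sum_assignment(cost)
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < 1e6]
```

The category check here stands in for the paper's category-consistency verification: a spatially plausible pair is still rejected if radar and camera disagree on what kind of object it is.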
Source:
Lei Cheng, Siyang Cao. ‘Radar-Camera Fused Multi-Object Tracking: Online Calibration and Common Feature.’ arXiv:2510.20794 [cs.CV], 2025. https://arxiv.org/abs/2510.20794

