Beyond 3D Siamese Tracking:
A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds

CVPR 2022 (Oral)

Chaoda Zheng, Xu Yan, Haiming Zhang, Baoyuan Wang, Shenghui Cheng, Shuguang Cui, Zhen Li

The Chinese University of Hong Kong, Shenzhen

Motivation & Method

For single object tracking in LiDAR scenes (LiDAR SOT), previous methods rely on appearance matching to localize the target using a target template.

Siamese matching paradigm

However, as shown in the following figure, matching-based approaches become unreliable when dealing with drastic appearance changes and distractors, which commonly exist in LiDAR scenes.

Distracted cases

Since the task deals with a dynamic scene across a video sequence, the target's movements across successive frames provide useful cues for distinguishing distractors and handling appearance changes. We present, for the first time, a motion-centric paradigm for LiDAR SOT. By explicitly learning from the various "relative target motions" in the data, the paradigm robustly localizes the target in the current frame via motion transformation.

Motion-Centric paradigm
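
To make "localizing via motion transformation" concrete, here is a minimal sketch of how a predicted relative motion could be applied to the previous target box to obtain the current one. It is an illustration under stated assumptions, not the released implementation: the `Box` class, the function names, the 4-DOF motion parameterization (translation plus a heading change), and the choice to express the translation in the previous box's local frame are all hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical 3D box: center, size, and heading about the vertical axis."""
    x: float
    y: float
    z: float
    w: float
    l: float
    h: float
    yaw: float  # radians


def wrap_angle(a: float) -> float:
    """Wrap an angle to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def apply_relative_motion(prev: Box, dx: float, dy: float,
                          dz: float, dyaw: float) -> Box:
    """Move the previous box by a relative motion (dx, dy, dz, dyaw).

    Assumption: (dx, dy) is expressed in the previous box's local frame,
    so it is rotated by the previous heading before being added to the
    world-frame center. The box size is carried over unchanged.
    """
    c, s = math.cos(prev.yaw), math.sin(prev.yaw)
    world_dx = c * dx - s * dy
    world_dy = s * dx + c * dy
    return Box(
        x=prev.x + world_dx,
        y=prev.y + world_dy,
        z=prev.z + dz,
        w=prev.w,
        l=prev.l,
        h=prev.h,
        yaw=wrap_angle(prev.yaw + dyaw),
    )


# Usage: a target heading along +y moves one unit forward in its own frame.
prev_box = Box(x=10.0, y=5.0, z=0.0, w=2.0, l=4.0, h=1.5, yaw=math.pi / 2)
curr_box = apply_relative_motion(prev_box, dx=1.0, dy=0.0, dz=0.0, dyaw=0.0)
```

In this sketch the tracker never compares appearance between frames; the current box follows entirely from the previous box and the predicted motion, which is the property the paradigm relies on when distractors look like the target.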

Based on the motion-centric paradigm, we propose a two-stage tracker, M^2-Track. In the first stage, M^2-Track localizes the target within successive frames via motion transformation. In the second stage, it refines the target box through motion-assisted shape completion. M^2-Track significantly outperforms previous state-of-the-art methods and shows further potential when simply integrated with appearance matching.

M^2-Track Architecture

Distractor Statistics

Distributions of distractors for car/vehicle objects on different datasets:

distractor statistics

NuScenes and Waymo are more challenging for matching-based approaches because distractors are widespread in their scenes. M^2-Track, however, handles distractors robustly via explicit motion modeling.
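
As a rough illustration of how a per-target distractor count might be gathered, the sketch below counts same-class objects whose centers lie within a fixed bird's-eye-view radius of the target. The criterion, the 2-meter default radius, and the function name are assumptions made for this example only; the exact definition behind the statistics above is not stated on this page.

```python
import math
from typing import Iterable, Tuple

Point2D = Tuple[float, float]


def count_distractors(target_center: Point2D,
                      other_centers: Iterable[Point2D],
                      radius: float = 2.0) -> int:
    """Count same-class objects within `radius` meters of the target.

    Assumption: distance is measured between box centers in the
    bird's-eye-view (x, y) plane, and the boundary is inclusive.
    """
    tx, ty = target_center
    return sum(
        1
        for (x, y) in other_centers
        if math.hypot(x - tx, y - ty) <= radius
    )


# Usage: two of the three neighbouring cars fall inside the radius.
n_distractors = count_distractors(
    (0.0, 0.0), [(1.0, 0.0), (3.0, 0.0), (0.0, -2.0)], radius=2.0
)
```

Aggregating such counts over all frames of a dataset would give a histogram comparable in spirit to the distributions shown above.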

Quantitative Results

NuScenes & Waymo
results on nuscenes and waymo
Comparison & Behavior Analysis in KITTI
results on KITTI
robustness to distractors

Qualitative Results

Tracking on Cars
results on nuscenes and waymo
Tracking on Pedestrian
results on nuscenes and waymo


                title={Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds},
                author={Zheng, Chaoda and Yan, Xu and Zhang, Haiming and Wang, Baoyuan and Cheng, Shenghui and Cui, Shuguang and Li, Zhen},
                journal={arXiv preprint arXiv:2203.01730},