TP-AE: Temporally Primed 6D Object Pose Tracking with Auto-Encoders
Linfang Zheng
Ales Leonardis
Tze Ho Elden Tse
Nora Horanyi
Hua Chen
Wei Zhang
Hyung Jin Chang
[Paper]
[GitHub]

Abstract

Fast and accurate tracking of an object's motion is a key capability a robotic system needs for reliable interaction with its environment. This paper focuses on the instance-level six-dimensional (6D) pose tracking problem for symmetric and textureless objects under occlusion. We propose a Temporally Primed 6D pose tracking framework with Auto-Encoders (TP-AE) to tackle this problem. The framework consists of a prediction step and a temporally primed pose estimation step. The prediction step quickly and efficiently produces a prior estimate of the object's current pose from historical information about the target object's motion. Once this prior is obtained, the temporally primed pose estimation step embeds the prior pose into the RGB-D input and leverages auto-encoders to reconstruct the target object with higher quality under occlusion, thereby improving tracking performance. Extensive experiments show that the proposed method accurately estimates the 6D pose of symmetric and textureless objects under occlusion, and significantly outperforms the state of the art on the T-LESS dataset while running in real time.
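To make the two-step loop concrete, below is a minimal Python sketch of the tracking cycle described above. The constant-velocity motion prior and the reconstruct() and estimate_pose() callables are hypothetical placeholders standing in for the paper's networks; this is an illustration under those assumptions, not the released implementation.

import numpy as np

def predict_prior(pose_prev, pose_curr):
    # Step 1 (prediction): extrapolate the current pose from the most
    # recent inter-frame motion. Poses are 4x4 homogeneous transforms;
    # a constant-velocity model is assumed here for illustration.
    delta = pose_curr @ np.linalg.inv(pose_prev)  # last relative motion
    return delta @ pose_curr                      # one-step extrapolation

def track(frames, pose_init, reconstruct, estimate_pose):
    # Step 2 (temporally primed estimation): embed the prior pose into
    # the RGB-D input, reconstruct the (possibly occluded) object with
    # the auto-encoder, then estimate the refined 6D pose. The two
    # callables are hypothetical stand-ins, not the authors' API.
    poses = [pose_init, pose_init]  # bootstrap the motion history
    for rgbd in frames:
        prior = predict_prior(poses[-2], poses[-1])
        recon = reconstruct(rgbd, prior)            # occlusion-robust reconstruction
        poses.append(estimate_pose(recon, prior))   # refined pose for this frame
    return poses[2:]

A constant-velocity prior is the simplest model consistent with "historical information about the target object's motion"; the actual framework may use a different motion model.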


Code

The code will be released soon.

[GitHub]


Paper and Supplementary Material

L. Zheng, A. Leonardis, T. H. E. Tse, N. Horanyi, H. Chen, W. Zhang, H. J. Chang
TP-AE: Temporally Primed 6D Object Pose Tracking with Auto-Encoders
ICRA, 2022.
(hosted on arXiv)


[Bibtex]


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants 62073159 and 62003155, the Shenzhen Science and Technology Program under Grant JCYJ20200109141601708, the Science, Technology and Innovation Commission of Shenzhen Municipality under Grant ZDSYS20200811143601004, and the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (2021-0-00537, Visual common sense through self-supervised learning for restoration of invisible parts in images). (Corresponding author: Wei Zhang.)