SLAM-for-AR Competition @ ISMAR2019

Click here to visit our competition Homepage.

The competition results and the system descriptions have been published.
Competition Results - V-SLAM
Competition Results - VI-SLAM

Download

Dataset Format

Each sequence provides several ‘sensors’, each with a sensor.yaml file that specifies the sensor type and the intrinsic and extrinsic parameters. The sensor measurements (or, for the camera, measurement indices) are stored in data.csv. In our case, both camera and IMU data are provided. The vicon and groundtruth data are also treated as ‘sensors’. Here is an example:

A01
|--camera
|   |--sensor.yaml
|   |--data.csv
|   `--data
|     |--771812250517066.png
|     |--771812283849357.png
|     `--...
|--imu
|   |--sensor.yaml
|   `--data.csv
|--vicon
|   |--sensor.yaml
|   `--data.csv
`--groundtruth
    |--sensor.yaml
    `--data.csv
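
For convenience, here is a minimal Python sketch for loading one sequence. The exact column layout of each data.csv is defined by its sensor.yaml; the EuRoC-style layout assumed below (camera: `timestamp,filename`; IMU: timestamp followed by the IMU readings) is an illustration only, so check the yaml files for the authoritative format.

```python
import csv
from pathlib import Path

import yaml  # pip install pyyaml


def load_sequence(root):
    """Load one sequence (e.g. 'A01') from the dataset layout above.

    Assumes camera/data.csv lists 'timestamp,filename' pairs pointing
    into camera/data/, and imu/data.csv lists a timestamp followed by
    the IMU readings. Check each sensor.yaml for the actual format.
    """
    root = Path(root)

    # Intrinsic/extrinsic calibration of the camera.
    with open(root / "camera" / "sensor.yaml") as f:
        cam_calib = yaml.safe_load(f)

    # Camera frames: list of (timestamp, image path).
    frames = []
    with open(root / "camera" / "data.csv") as f:
        for row in csv.reader(f):
            if not row or row[0].lstrip().startswith("#"):
                continue  # skip header/comment lines
            frames.append((int(row[0]), root / "camera" / "data" / row[1].strip()))

    # IMU samples: list of (timestamp, [measurements...]).
    imu = []
    with open(root / "imu" / "data.csv") as f:
        for row in csv.reader(f):
            if not row or row[0].lstrip().startswith("#"):
                continue
            imu.append((int(row[0]), [float(v) for v in row[1:]]))

    return cam_calib, frames, imu
```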

Evaluation Instructions

The Format of the Submission Results

The estimated 6-DoF camera poses (from the camera coordinate frame to the world coordinate frame) are required to evaluate the performance. Since the estimation involves a certain amount of randomness, each sequence must be run 5 times, resulting in 5 pose files and 5 running-time files. We will select the median of the five results for evaluation. The format of each pose file is as follows:

timestamp[i] p_x p_y p_z q_x q_y q_z q_w

where (p_x, p_y, p_z) is the camera position and the unit quaternion (q_x, q_y, q_z, q_w) is the camera orientation. You should output the real-time pose after each frame is processed (not the poses after final global optimization), and the poses should be output at the same frame rate as the input camera images (otherwise, the completeness evaluation will be affected).

The format of each running-time file is as follows:

timestamp[i] t_pose

where t_pose denotes the system time (or cumulative running time) at which the pose is estimated, in seconds with at least three decimal places (including for the black frames in D6).
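
For illustration, a minimal sketch of the logging loop is shown below; `slam.track` and its return values are placeholders for your own system, and the formatting simply follows the two file formats above.

```python
import time


def run_and_log(frames, slam, pose_path="0-pose.txt", time_path="0-time.txt"):
    """Write one pose line and one running-time line per processed frame.

    Pose line: timestamp p_x p_y p_z q_x q_y q_z q_w
    Time line: timestamp t_pose   (seconds, >= 3 decimal places)
    `slam.track` is a placeholder that returns the current position
    (px, py, pz) and unit quaternion (qx, qy, qz, qw).
    """
    start = time.perf_counter()
    with open(pose_path, "w") as fp, open(time_path, "w") as ft:
        for timestamp, image in frames:
            px, py, pz, qx, qy, qz, qw = slam.track(timestamp, image)
            # Real-time pose right after this frame is processed,
            # not the pose after final global optimization.
            fp.write(f"{timestamp} {px:.6f} {py:.6f} {pz:.6f} "
                     f"{qx:.6f} {qy:.6f} {qz:.6f} {qw:.6f}\n")
            # Cumulative running time in seconds, three decimal places.
            ft.write(f"{timestamp} {time.perf_counter() - start:.3f}\n")
```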

Please submit a zip file containing all the pose and running-time files. The structure of the zip file should be as follows:

YourSLAMName/sequence_name/Round-pose.txt
YourSLAMName/sequence_name/Round-time.txt

e.g.
MY-SLAM/C0_test/0-pose.txt
MY-SLAM/C0_test/0-time.txt

You can click here to download the example.
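
One possible way to assemble the archive with this layout (the system name, directory names, and file paths below are placeholders):

```python
import zipfile
from pathlib import Path


def pack_submission(result_dir, slam_name="MY-SLAM", out="submission.zip"):
    """Zip Round-pose.txt / Round-time.txt files as
    YourSLAMName/sequence_name/Round-pose.txt, etc.

    Expects result_dir/<sequence_name>/<round>-pose.txt and
    <round>-time.txt from the 5 runs of each sequence.
    """
    result_dir = Path(result_dir)
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for txt in sorted(result_dir.glob("*/*-*.txt")):
            # e.g. results/C0_test/0-pose.txt -> MY-SLAM/C0_test/0-pose.txt
            zf.write(txt, f"{slam_name}/{txt.parent.name}/{txt.name}")
```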

Evaluation

We evaluate the overall performance of a SLAM system considering tracking accuracy, initialization quality, tracking robustness, relocalization time, and computational efficiency. The criteria are as follows:

  • $\color{black}{\varepsilon_{APE} / \varepsilon_{ARE}}$ - absolute positional / rotational error
  • $\color{black}{\varepsilon_{RPE} / \varepsilon_{RRE}}$ - relative positional / rotational error
  • $\color{black}{\varepsilon_{bad}}$ - the ratio of bad poses (100% - completeness)
  • $\color{black}{\varepsilon_{init}}$ - initialization quality
  • $\color{black}{\varepsilon_{RO}}$ - tracking robustness
  • $\color{black}{t_{RL}}$ - relocalization time

The detailed description of the above criteria can be found in the following paper:

Jinyu Li, Bangbang Yang, Danpeng Chen, Nan Wang, Guofeng Zhang, Hujun Bao. Survey and evaluation of monocular visual-inertial SLAM algorithms for augmented reality. Journal of Virtual Reality & Intelligent Hardware, 2019, 1(4): 386-410. DOI:10.3724/SP.J.2096-5796.2018.0011. URL: http://www.vr-ih.com/vrih/html/EN/10.3724/SP.J.2096-5796.2018.0011/article.html

We convert each criterion error $\color{black}{\varepsilon_{i}}$ into a normalized score by $\color{black}{s_i=\frac{{\sigma_i}^2}{{\sigma_i}^2+{\varepsilon_i}^2}\times100\%}$, where $\color{black}{{\sigma_i}^2}$ is the variance controlling the shape of the normalization function. The complete score is a weighted sum of all the individual scores:

$\color{black}{S=w_{APE}s_{APE}+w_{ARE}s_{ARE}+w_{RPE}s_{RPE}+w_{RRE}s_{RRE}+w_{bad}s_{bad}+w_{init}s_{init}+w_{RO}s_{RO}+w_{RL}s_{RL}}$

The weight $\color{black}{w}$ and variance $\color{black}{\sigma}$ (V-SLAM / VI-SLAM) for each criterion are listed below:

| $\color{black}{w_{APE}}$ | $\color{black}{w_{ARE}}$ | $\color{black}{w_{RPE}}$ | $\color{black}{w_{RRE}}$ | $\color{black}{w_{bad}}$ | $\color{black}{w_{init}}$ | $\color{black}{w_{RO}}$ | $\color{black}{w_{RL}}$ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1.0 | 1.0 | 0.5 | 0.5 | 1.0 | 1.0 | 1.0 | 1.0 |
| $\color{black}{\sigma_{APE}}$ | $\color{black}{\sigma_{ARE}}$ | $\color{black}{\sigma_{RPE}}$ | $\color{black}{\sigma_{RRE}}$ | $\color{black}{\sigma_{bad}}$ | $\color{black}{\sigma_{init}}$ | $\color{black}{\sigma_{RO}}$ | $\color{black}{\sigma_{RL}}$ |
| 72.46 / 55.83 | 7.41 / 2.48 | 6.72 / 2.92 | 0.26 / 0.17 | 20.68 / 2.38 | 2.79 / 1.85 | 2.27 / 0.95 | 0.65 / 1.42 |

The variances $\color{black}{\sigma}$ listed above are obtained by computing the median of our previous evaluation results, which include 4 V-SLAM systems (PTAM, ORB-SLAM2, LSD-SLAM, DSO) and 4 VI-SLAM systems (MSCKF, OKVIS, VINS-Mono, SenseSLAM) evaluated on our previously released dataset.

You can evaluate your SLAM system with our training dataset using the evaluation tool: https://github.com/zju3dv/eval-vislam.
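
For a rough offline sanity check, the sketch below applies the scoring formulas above literally, with the weights and V-SLAM variances taken from the table; the eval-vislam tool linked above remains the authoritative implementation.

```python
def normalized_score(error, sigma):
    """s_i = sigma_i^2 / (sigma_i^2 + e_i^2), as a fraction in [0, 1]."""
    return sigma ** 2 / (sigma ** 2 + error ** 2)


# Weights and V-SLAM variances from the table above.
WEIGHTS = {"APE": 1.0, "ARE": 1.0, "RPE": 0.5, "RRE": 0.5,
           "bad": 1.0, "init": 1.0, "RO": 1.0, "RL": 1.0}
SIGMAS_VSLAM = {"APE": 72.46, "ARE": 7.41, "RPE": 6.72, "RRE": 0.26,
                "bad": 20.68, "init": 2.79, "RO": 2.27, "RL": 0.65}


def overall_score(errors, sigmas=SIGMAS_VSLAM, weights=WEIGHTS):
    """S = sum_i w_i * s_i over the eight criteria."""
    return sum(weights[k] * normalized_score(errors[k], sigmas[k])
               for k in weights)
```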

In the final round of the competition, we will test all systems on benchmark PCs with the same hardware configuration. The running time is taken into account when computing the final score, according to the following equation:

$\color{black}{S^*=\frac{\text{min}(30,framerate)}{30}S}$

where $\color{black}{framerate}$ denotes the average framerate of the system.
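
Expressed as code, the framerate adjustment is simply (`benchmark_score` here is the weighted score S from above):

```python
def final_score(benchmark_score, framerate, target_fps=30.0):
    """S* = min(30, framerate) / 30 * S -- no bonus beyond real time."""
    return min(target_fps, framerate) / target_fps * benchmark_score
```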

It should be noted that not all sequences are evaluated with all the criteria. The corresponding criteria for each group of sequences are listed below:

| Sequences | Corresponding Criteria |
| --- | --- |
| C0-C11, D8-D10 | APE, RPE, ARE, RRE, Badness, Initialization Quality |
| D0-D4 | Tracking Robustness |
| D5-D7 | Relocalization Time |

Motion and Scene Type of Sequences

| Device | Sequence | Motion | Scene | Description |
| --- | --- | --- | --- | --- |
| Xiaomi MI8 | C0 | inspect+patrol | floor | Walking and looking around the glossy floor. |
| | C1 | inspect+patrol | clean | Walking around some texture-less areas. |
| | C2 | inspect+patrol | mess | Walking around some random objects. |
| | C3 | aiming+inspect | mess+floor | Random objects first, and then the glossy floor. |
| | C4 | aiming+inspect | desktop+clean | From a small scene to a texture-less area. |
| | C5 | wave+inspect | desktop+mess | From a small scene to a texture-rich area. |
| | C6 | hold+inspect | desktop | Looking at a small desktop scene. |
| | C7 | inspect+aiming | desktop | Looking at a small desktop scene. |
| | C8 | inspect+forward | clean | Walking forward in a texture-less area at a low position. |
| | C9 | inspect+forward | mess | Walking forward in a texture-rich area at a low position. |
| | C10 | inspect+downward | clean | Walking backward in a texture-less area at a low position. |
| | C11 | inspect+downward | mess | Walking backward in a texture-rich area at a low position. |
| | D0 | rapid-rotation | desktop | Rotating the phone rapidly at times. |
| | D1 | rapid-translation | desktop | Moving the phone rapidly at times. |
| | D2 | rapid-shaking | desktop | Shaking the phone violently at times. |
| | D3 | inspect | moving people | A person walks in and out. |
| | D4 | inspect | covering camera | An object occasionally occludes the camera. |
| | D5 | inspect | desktop | Similar to A6 but with black frames. |
| | D6 | inspect | desktop | Similar to A6 but with black frames. |
| | D7 | inspect | desktop | Similar to A6 but with black frames. |
| | D8 | inspect | foreground+background | Walking around the near plane and the far plane. |
| | D9 | inspect | plant | Walking around the plant. |
| | D10 | loop | office | Walking around the office with loop closure. |

Dataset Preview

Video Source From YouTube

Video Source From bilibili

Competition Results - V-SLAM

$\text{Final Score} = \text{Benchmark Score} \times \text{Speed Penalty} \times 100$

Click here to download the detailed result.

  • Configuration of the benchmark PC:
    • CPU: i7-9700K 3.60 GHz
    • Memory: 32 GB
    • GPU: NVIDIA RTX 2070 8 GB
    • Hard Disk: Samsung SSD 850 EVO 500 GB

| Rank | Participants | Affiliation | System Name | APE | RPE | ARE | RRE | Badness | Initialization Quality | Robustness | Relocalization Time | Benchmark Score | Average FrameRate | Speed Penalty | Final Score | System Description |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Zike Yan, Pijian Sun, Xin Wang, Shunkai Li, Sheng Zhang, Hongbin Zha | Key Laboratory of Machine Perception (MOE), School of EECS, Peking University | LF-SLAM | 0.7155 | 0.4037 | 0.8115 | 0.2078 | 0.4554 | 0.7166 | 0.9972 | 0.9965 | 0.8331 | 30.0687 | 1.0000 | 83.31 | Robust line tracking for monocular visual system (Doc will be uploaded after the acceptance of the paper) |
| 2 | Darius Rueckert | University of Erlangen-Nuremberg | Snake-SLAM | 0.7543 | 0.5700 | 0.8229 | 0.2709 | 0.5183 | 0.7818 | 0.6685* | 0.9974 | 0.8273* | 106.1527 | 1.0000 | 82.73* | Doc |
| 3 | Neo Yuan Rong Dexter, Toh Yu Heng | Pensees | AR-ORB-SLAM2 | 0.5511 | 0.3850 | 0.5280 | 0.2028 | 0.6020 | 0.7080 | 0.9947 | 0.9890 | 0.7778 | 31.0803 | 1.0000 | 77.78 | |
| 4 | Xinyu Wei, Zengming Tang, Huiyan Wu, Jun Huang | Shanghai Advanced Research Institute, Chinese Academy of Sciences | PL-SLAM | 0.6036 | 0.4754 | 0.7354 | 0.2376 | 0.4551 | 0.8062 | 0.9985 | 0.9916 | 0.8245 | 27.5107 | 0.9170 | 75.61 | Doc Slides |
| 5 | Ao Li, Yue Ni | University of Science and Technology of China | Dy-SLAM | 0.7588 | 0.4390 | 0.7953 | 0.2002 | 0.5776 | 0.6018 | 0.9981 | 0.8578 | 0.8182 | 3.8851 | 0.1295 | 10.60 | Doc |

* Note: the trajectory files output by Snake-SLAM (the executable submitted during the competition) mark invalid poses with an identity pose, i.e. (0 0 0 -0 -0 -0 1). Unfortunately, this does not match the invalid-pose format that we define, so our evaluation tool still treated them as valid poses, which affected the computed robustness score. If these invalid poses are removed, the robustness score of Snake-SLAM becomes 0.9348 and the final score becomes 87.17. Since this format issue was discovered after the competition and did not appear in the results of other teams, the competition ranking can no longer be changed.

Competition Results - VI-SLAM

$\text{Final Score} = \text{Benchmark Score} \times \text{Speed Penalty} \times 100$

Click here to download the detailed result.

  • Configuration of the benchmark PC:
    • CPU: i7-9700K 3.60 GHz
    • Memory: 32 GB
    • GPU: NVIDIA RTX 2070 8 GB
    • Hard Disk: Samsung SSD 850 EVO 500 GB

| Rank | Participants | Affiliation | System Name | APE | RPE | ARE | RRE | Badness | Initialization Quality | Robustness | Relocalization Time | Benchmark Score | Average FrameRate | Speed Penalty | Final Score | System Description |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Shaozu Cao, Jie Pan, Jieqi Shi, Shaojie Shen | Hong Kong University of Science and Technology | VINS-Mono | 0.6341 | 0.4225 | 0.4945 | 0.2429 | 0.8175 | 0.5678 | 0.8037 | 0.8572 | 0.7513 | 30.1062 | 1.0000 | 75.13 | Doc Slides |
| 2 | Xinyu Wei, Zengming Tang, Huiyan Wu, Jun Huang | Shanghai Advanced Research Institute, Chinese Academy of Sciences | PLVI-SLAM | 0.2767 | 0.0994 | 0.3383 | 0.1675 | 0.0097 | 0.2813 | 0.8183 | 0.6654 | 0.4205 | 18.3380 | 0.6113 | 25.71 | Doc |
| 3 | Jianhua Zhang, Shengyong Chen, Mengping Gui, Jialing Liu, Luzhen Ma, Kaiqi Chen | Zhejiang University of Technology | MMF-SLAM | 0.1203 | 0.0834 | 0.1760 | 0.1407 | 0.0010 | 0.3214 | 0.0102 | 0.0000 | 0.1235 | 29.9620 | 0.9987 | 12.33 | Doc Slides |

Competition Chairs

Guofeng Zhang

Zhejiang University, China

Jing Chen

Beijing Institute of Technology, China

Guoquan Huang

University of Delaware, USA

Acknowledgement

We thank Bangbang Yang for his great help in preparing the competition dataset, building the website, and evaluating the participating SLAM systems.