A Benchmark for 3D Object Tracking in Driving Environments (3D-OTD) v1.0

Research in autonomous driving has matured significantly, with major automotive companies working to make autonomous or highly automated cars available in the near future. As driverless cars move from laboratories to complex real-world environments, the perception capabilities of these systems to acquire, model and interpret 3D spatial information must become more powerful. Object tracking is one of the challenging problems of autonomous driving in dynamic 3D environments. Although different object tracking approaches have been proposed and have demonstrated success in driving environments, they are very difficult to evaluate and compare because they are defined under differing constraints and boundary conditions. The research focus of this benchmark dataset, called 3D Object Tracking in Driving Environments (3D-OTD), is the evaluation of appearance modeling for object tracking in driving environments, using the multimodal perception systems of autonomous cars and advanced driver assistance systems (ADASs).
Description of 3D-OTD Dataset:
In the original KITTI dataset, objects are annotated with their tracklets, and the dataset focuses mainly on evaluating the data association problem in discriminative approaches. Our goal is to provide a tool for assessing object appearance modeling in both discriminative and generative methods. Therefore, instead of tracklets, the full track of each object is extracted. To facilitate performance evaluation, a benchmark dataset of 50 annotated sequences is constructed from the ‘KITTI Object Tracking Evaluation’. In this benchmark, each sequence denotes the trajectory of a single target object (i.e., a scenario that includes two target objects is treated as two sequences).

The initial frames of the sequences are shown in the figure below, and the general specifications and most challenging factors of each sequence are reported in the table. The table describes the scene, sequence, and objects: the number of frames in each sequence; the object type (car ‘C’, pedestrian ‘P’, or cyclist ‘Y’); the object and Ego-vehicle situations (moving ‘M’ or stationary ‘S’); and the scene condition (roads in an urban environment ‘U’, or alleys and avenues in downtown ‘D’). It also reports the object width (Im-W) and height (Im-H) in the first frame (in pixels), and the width (PCD-W), height (PCD-H), and length (PCD-L) in the first PCD (in meters) of each sequence. Each sequence is categorized according to the following challenges: occlusion (OCC), variations in object pose (POS) and distance (DIS) relative to the Ego-vehicle, and changes in the relative velocity (RVL) of the object with respect to the Ego-vehicle.
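
The table codes described above can be captured in a small record type when working with the benchmark from a script. The following is only an illustrative Python sketch, not part of the released files: the class name `SequenceInfo`, its field names, and the example values are all assumptions made for illustration, and the numbers do not correspond to any real sequence in the table.

```python
from dataclasses import dataclass
from typing import Tuple

# Code letters and abbreviations as listed in the table description.
OBJECT_TYPES = {"C": "car", "P": "pedestrian", "Y": "cyclist"}
MOTION_STATES = {"M": "moving", "S": "stationary"}
SCENE_CONDITIONS = {
    "U": "roads in urban environment",
    "D": "alleys and avenues in downtown",
}
CHALLENGES = {
    "OCC": "occlusion",
    "POS": "object pose variation",
    "DIS": "distance variation",
    "RVL": "relative velocity change",
}


@dataclass(frozen=True)
class SequenceInfo:
    """One row of the table (hypothetical field names)."""

    num_frames: int
    object_type: str    # 'C', 'P' or 'Y'
    object_motion: str  # 'M' or 'S'
    ego_motion: str     # 'M' or 'S'
    scene: str          # 'U' or 'D'
    im_w: float         # object width in the first frame, pixels
    im_h: float         # object height in the first frame, pixels
    pcd_w: float        # object width in the first PCD, meters
    pcd_h: float        # object height in the first PCD, meters
    pcd_l: float        # object length in the first PCD, meters
    challenges: Tuple[str, ...] = ()

    def __post_init__(self) -> None:
        # Reject codes that the table description does not define.
        if self.object_type not in OBJECT_TYPES:
            raise ValueError(f"unknown object type: {self.object_type!r}")
        for state in (self.object_motion, self.ego_motion):
            if state not in MOTION_STATES:
                raise ValueError(f"unknown motion state: {state!r}")
        if self.scene not in SCENE_CONDITIONS:
            raise ValueError(f"unknown scene condition: {self.scene!r}")
        unknown = [c for c in self.challenges if c not in CHALLENGES]
        if unknown:
            raise ValueError(f"unknown challenge codes: {unknown}")

    def describe(self) -> str:
        """Expand the code letters into a readable summary."""
        parts = [
            f"{self.num_frames} frames",
            f"{MOTION_STATES[self.object_motion]} {OBJECT_TYPES[self.object_type]}",
            f"{MOTION_STATES[self.ego_motion]} Ego-vehicle",
            SCENE_CONDITIONS[self.scene],
        ]
        return ", ".join(parts)


# Made-up values for illustration only; not taken from the benchmark table.
example = SequenceInfo(
    num_frames=100, object_type="C", object_motion="M", ego_motion="S",
    scene="U", im_w=80.0, im_h=60.0, pcd_w=1.6, pcd_h=1.5, pcd_l=4.0,
    challenges=("OCC", "DIS"),
)
print(example.describe())
```

Validating the codes at construction time means that a mistyped letter is caught when a row is transcribed, not later during evaluation.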

How to use 3D-OTD Dataset:
The 3D-OTD dataset (50 sequences in total) is derived from the ‘KITTI – Object Tracking Evaluation’ dataset, so you first need to download the KITTI dataset. Next, download the Groundtruth Labels, which contain the full tracks of the 50 sequences, and the Baseline Trackers, which contain the 3D-KF and 3D-MS baseline trackers.

To run the baseline trackers, download the ‘Groundtruth Labels’ and ‘Baseline Trackers’ files and extract them into the same directory as the ‘KITTI – Object Tracking Evaluation’ folder. Then run ‘demo_3D_KF.m’ and ‘demo_3D_MS.m’ to run the baseline Kalman Filter and Mean Shift trackers, respectively. To choose which object to track, select its object id in the demo files.
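
Before launching the demos, it can help to confirm that everything was extracted side by side. The Python sketch below is a hedged convenience check, not part of the benchmark: the helper `missing_entries` is hypothetical, and the names in `EXPECTED` are placeholders. In particular, the KITTI folder name and the assumption that the demo scripts sit at the top level after extraction are guesses that should be adjusted to match the actual download.

```python
from pathlib import Path
from typing import Iterable, List


def missing_entries(root: Path, expected: Iterable[str]) -> List[str]:
    """Return the expected file or folder names that do not exist under root."""
    return [name for name in expected if not (root / name).exists()]


# Placeholder names (assumptions): replace with the real names on disk.
EXPECTED = [
    "KITTI_Object_Tracking_Evaluation",  # hypothetical KITTI folder name
    "demo_3D_KF.m",                      # baseline Kalman Filter demo
    "demo_3D_MS.m",                      # baseline Mean Shift demo
]

if __name__ == "__main__":
    missing = missing_entries(Path("."), EXPECTED)
    if missing:
        print("Missing from the current directory:", ", ".join(missing))
    else:
        print("All expected entries found.")
```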