Pedestrian Detection with Depth-guided Structure Labeling

Abstract

We propose a principled statistical approach that uses 3D information and scene context to reduce the number of false positives in stereo-based pedestrian detection. Current pedestrian detection algorithms focus on improving the discriminability of 2D features that capture pedestrian appearance, and on exploring various classifier architectures. Less attention has been paid to exploiting scene geometry and spatial context to improve detection performance. We make three contributions: (i) we define a new 3D feature, the Vertical Support Histogram, computed from dense stereo range maps to characterize local 3D structure; (ii) we estimate the likelihoods of these 3D features using kernel density estimation and use them within a Markov Random Field (MRF) that enforces spatial constraints between the features, yielding the Maximum A-Posteriori (MAP) scene labeling; (iii) we use the MAP scene labelings to reduce the number of candidate windows tested by a standard, state-of-the-art pedestrian appearance classifier. We evaluate our algorithm on a challenging, publicly available stereo dataset and compare its performance with state-of-the-art methods.
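As a rough illustration of contribution (ii), the sketch below estimates per-class likelihoods with a Gaussian kernel density estimate and then smooths the resulting labels on a grid MRF. It is not the authors' implementation, and everything in it is an assumption: the scalar `features` grid stands in for the Vertical Support Histogram (which the abstract does not define), the two classes, the bandwidth, the Potts smoothness weight `beta`, and the function names are hypothetical, and Iterated Conditional Modes (ICM) is used only as a simple approximate MAP solver, since the abstract does not state which inference method the paper uses.

```python
import numpy as np


def gaussian_kde_1d(samples, bandwidth):
    """Return a function evaluating a 1D Gaussian KDE built from `samples`."""
    samples = np.asarray(samples, dtype=float)
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)

    def density(x):
        x = np.asarray(x, dtype=float)
        z = (x[..., None] - samples) / bandwidth  # broadcast over all samples
        return np.exp(-0.5 * z ** 2).sum(axis=-1) / norm

    return density


def icm_labeling(unary, beta=0.5, n_iters=5):
    """Approximate MAP labeling of a 4-connected grid MRF with a Potts prior.

    unary: (H, W, K) array of negative log-likelihoods per cell and label.
    beta:  penalty added for each neighbor carrying a different label.
    """
    H, W, K = unary.shape
    labels = unary.argmin(axis=2)  # start from the per-cell best label
    label_ids = np.arange(K)
    for _ in range(n_iters):
        changed = False
        for i in range(H):
            for j in range(W):
                cost = unary[i, j].copy()
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W:
                        cost += beta * (label_ids != labels[ni, nj])
                best = int(cost.argmin())
                if best != labels[i, j]:
                    labels[i, j] = best
                    changed = True
        if not changed:  # converged to a local minimum of the energy
            break
    return labels


# Toy data (hypothetical): class 0 features cluster near 0, class 1 near 2.
p0 = gaussian_kde_1d([0.0, 0.1, -0.1], bandwidth=0.5)
p1 = gaussian_kde_1d([2.0, 2.1, 1.9], bandwidth=0.5)

# A 3x3 grid of feature values with one ambiguous cell in the center.
features = np.full((3, 3), 2.0)
features[1, 1] = 0.9

# Unary term: negative log-likelihood under each class (epsilon avoids log 0).
unary = -np.log(np.stack([p0(features), p1(features)], axis=-1) + 1e-12)
labels = icm_labeling(unary, beta=0.5)
```

In this toy grid the center cell is slightly more likely under class 0 on its own, but its four neighbors all strongly favor class 1, so the smoothness term pulls the center toward the neighboring label. ICM only finds a local minimum; exact or stronger approximate solvers (graph cuts, belief propagation) are common alternatives for this kind of energy.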

Publication
IEEE Workshop on Search in 3D and Video (ICCV Workshop)