LiDAR Point Clouds & Applications

Overview

In this post, we focus on the high-definition LiDAR point clouds that are ubiquitous in autonomous driving. A point cloud acquired by a terrestrial LiDAR sensor (cf. airborne LiDAR, which is typically mounted on aircraft) usually contains ~100k points, yet it is sparse and its density varies widely.

Figure 1. LiDAR mounted on Apple's 'Project Titan' test Lexus SUV.

Applications

1. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

Figure 2. Architecture of PointNet

Point clouds are not an ideal data format for deep learning operators because of their irregular structure and inherent variability. PointNet extracts semantic information from point clouds in a way that is invariant to point order, point density and Euclidean transformations. To conclude:
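Order invariance, in particular, comes from applying the same function to every point and then pooling with a symmetric operator. The following is a minimal sketch of that idea, not PointNet itself: `point_feature` is a hypothetical single linear layer with ReLU standing in for the shared MLP, and the random weights and the 3-to-8 feature size are illustrative assumptions.

```python
import random

def point_feature(p, weights):
    """Shared per-point map: one linear layer + ReLU (stand-in for a shared MLP)."""
    return [max(0.0, sum(w * c for w, c in zip(row, p))) for row in weights]

def global_feature(points, weights):
    """Element-wise max over all per-point features.

    max is a symmetric function, so the result cannot depend on point order.
    """
    feats = [point_feature(p, weights) for p in points]
    return [max(col) for col in zip(*feats)]

rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]  # 3 -> 8 features
cloud = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(100)]

shuffled = cloud[:]
rng.shuffle(shuffled)

# The same global descriptor comes out for any ordering of the same points.
assert global_feature(cloud, weights) == global_feature(shuffled, weights)
```

Any other symmetric pooling (sum, mean) would give the same order invariance; max is used here because it is the operator the post describes for VoxelNet's closely related feature network.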

2. VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection

Figure 3. Architecture of VoxelNet

Object detection (of cars, pedestrians and cyclists, for example) is a critical component of scene understanding for driverless cars. Although Apple has been largely quiet about its self-driving efforts, Apple researchers released a paper on arXiv on 2017/11/17 describing LiDAR-only detection of 3D objects.

Voxel feature network. The point cloud is first divided into a regular grid of voxels, and each non-empty voxel is then described by a compact feature vector produced by the proposed feature network. As shown in Figure 2, the feature network is somewhat similar to PointNet. The input for each point is a 6-dimensional vector \(\mathbf{v} = [x, y, z, x-c_x, y-c_y, z-c_z]\), where \([c_x, c_y, c_z]^T\) is the centroid of the points in the voxel. A stack of voxel feature encoding (VFE) layers then produces point-wise descriptors for the points and voxel-wise descriptors for the voxels. The voxel-wise descriptor is essentially the element-wise maximum of the point-wise descriptors of the points residing in the voxel. (We refer the reader to the paper for details.)
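The data flow described above (voxel grouping, centroid-relative augmentation, element-wise maximum) can be sketched in a few lines. This is only an illustration under simplifying assumptions: the learned fully connected layers inside a VFE layer are omitted, so the "point-wise descriptor" here is just the raw 6-dimensional input, and the function names and the `voxel_size` parameter are hypothetical.

```python
from collections import defaultdict
import math

def voxelize(points, voxel_size):
    """Group points by the integer index of the voxel that contains them."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        voxels[key].append((x, y, z))
    return voxels  # only non-empty voxels ever get a key

def augment(voxel_points):
    """Build [x, y, z, x-c_x, y-c_y, z-c_z] for every point in one voxel."""
    n = len(voxel_points)
    c = [sum(p[i] for p in voxel_points) / n for i in range(3)]  # centroid
    return [list(p) + [p[i] - c[i] for i in range(3)] for p in voxel_points]

def voxel_descriptor(pointwise):
    """Element-wise maximum over the point-wise descriptors of one voxel."""
    return [max(col) for col in zip(*pointwise)]

points = [(0.1, 0.2, 0.3), (0.3, 0.4, 0.1), (1.5, 0.2, 0.2)]
voxels = voxelize(points, voxel_size=1.0)
descriptors = {k: voxel_descriptor(augment(v)) for k, v in voxels.items()}
```

Keying a dictionary by voxel index means empty voxels cost nothing, which matters because most of the grid is empty for a sparse LiDAR sweep.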

Region Proposal Network. The voxel grid and the generated voxel-wise descriptors form a regular feature map for object detection. A region proposal network (RPN), previously used for 2D detection, is adapted here to predict 3D bounding boxes and their probability scores.

3. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

Figure 4. Architecture of PointNet++

As pointed out by VoxelNet, PointNet does not generalize well to large-scale point sets. It operates on each point independently and is consequently blind to the local structure of the point set. PointNet++, an extension of PointNet, is proposed to capture local structural information in a hierarchical manner.

Set Abstraction. The core of PointNet++ is the set abstraction layer. It subsamples the point set by farthest point sampling (greedily picking the point most distant from the points already selected). It then encodes each sampled point with a PointNet applied to the points inside that point's spherical neighborhood (termed a group in the paper). The sampled points, with their encoded features, are in turn treated as a new point set and fed to the next set abstraction layer. Finally, the recursively sampled point set, now much smaller, is used for classification and segmentation.
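The sampling and grouping steps can be sketched as follows. This is a plain-Python illustration rather than the paper's implementation: the function names, the fixed `start` index and the example radius are assumptions, and the per-group PointNet encoding is left out.

```python
import math

def farthest_point_sampling(points, k, start=0):
    """Greedily pick k indices; each new pick is farthest from all picks so far."""
    chosen = [start]
    # min_dist[i] = distance from point i to its nearest already-chosen point
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        min_dist = [min(d, math.dist(p, points[nxt]))
                    for d, p in zip(min_dist, points)]
    return chosen

def ball_group(points, centers, radius):
    """For each sampled center, list the indices of points inside its ball."""
    return [[i for i, p in enumerate(points)
             if math.dist(p, points[c]) <= radius]
            for c in centers]

points = [(0, 0, 0), (0.1, 0, 0), (1, 0, 0), (1, 0.1, 0), (0, 1, 0)]
centers = farthest_point_sampling(points, k=3)
groups = ball_group(points, centers, radius=0.2)
```

Tracking only the distance to the nearest chosen point keeps each iteration linear in the number of points, so sampling k centers from n points costs O(nk) distance evaluations.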

For the segmentation task, a label must be assigned to each point in the original point set rather than only to the sampled points. To this end, an interpolation strategy restores descriptors for the discarded points as a weighted sum of the descriptors of neighboring sampled points, proceeding in the reverse direction of the sampling.
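One way to realize such a weighted sum is an inverse-distance-weighted average over the k nearest sampled points. The sketch below assumes that weighting scheme, with squared distances; `k`, `eps` and the function name are illustrative choices, not values taken from this post.

```python
import math

def interpolate_features(query, sampled_xyz, sampled_feats, k=3, eps=1e-8):
    """Inverse-distance-weighted average of the k nearest sampled points' features."""
    # (distance, index) pairs for the k sampled points closest to the query
    nearest = sorted((math.dist(query, s), j)
                     for j, s in enumerate(sampled_xyz))[:k]
    # eps avoids division by zero when the query coincides with a sampled point
    weights = [(1.0 / (d * d + eps), j) for d, j in nearest]
    total = sum(w for w, _ in weights)
    dim = len(sampled_feats[0])
    return [sum(w * sampled_feats[j][c] for w, j in weights) / total
            for c in range(dim)]

sampled_xyz = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
sampled_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# A discarded point very close to the first sampled point inherits
# a descriptor that is almost identical to that point's descriptor.
near_first = interpolate_features((0.01, 0, 0), sampled_xyz, sampled_feats)
```

Because the weights are normalized to sum to one, the restored descriptor always lies within the range spanned by its neighbors' descriptors.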