Charles R. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas (2017)
- Novel deep net architecture that directly consumes point clouds, without voxelization or rendering
- Unified architecture that learns both global and local point features
- Output: class category score, feature map
- Using point clouds directly avoids quantization artifacts, high-cost computations and the need for pre-processing
- Properties of point clouds:
- Unordered: so network that consumes N 3D points needs to be invariant to N! permutations in feeding order
- Interaction among points: neighboring points are useful --> model needs to be able to capture local structure from nearby points, and the combinatorial interactions among local structures
- Invariance under transformations: rotating and translating points all together shouldn't change category/segmentation
- 3 solutions for this:
- Use symmetric function: max pooling to aggregate information from all points (invariant to input order)
- Combining local and global features: concatenate global feature with each of the point features --> network is able to predict per-point quantities that rely on local geometry and global semantics
- Joint alignment network: learn the affine transformation and apply to both points and features + add regularization term
- Network learns to summarize a point cloud by a sparse set of key points (skeleton of objects): global features
- Highly robust to small perturbations of inputs and corruption (through inserting outliers or deletion)