We present a network architecture for processing point clouds that directly operates on a collection of points represented as a sparse set of samples in a high-dimensional lattice. Naïvely applying convolutions on this lattice scales poorly, both in terms of memory and computational cost, as the size of the lattice increases. Instead, our network uses sparse bilateral convolutional layers as building blocks. These layers maintain efficiency by using indexing structures to apply convolutions only on occupied parts of the lattice, and allow flexible specifications of the lattice structure enabling hierarchical and spatially-aware feature learning, as well as joint 2D-3D reasoning. Both point-based and image-based representations can be easily incorporated in a network with such layers and the resulting model can be trained in an end-to-end manner. We present results on 3D segmentation tasks where our approach outperforms existing state-of-the-art techniques.
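The idea of convolving only over occupied lattice cells can be illustrated with a minimal sketch. It is not the authors' implementation: a regular integer grid stands in for the high-dimensional lattice, features are averaged per cell instead of interpolated, and the names `splat`, `sparse_conv`, `cell_size`, and `weights` are hypothetical. A Python dictionary plays the role of the indexing structure.

```python
import numpy as np


def splat(points, features, cell_size):
    """Average point features into the occupied cells of an integer grid.

    Assumption: a plain grid and per-cell averaging stand in for the
    lattice and interpolation scheme of the actual method.
    """
    sums, counts = {}, {}
    for p, f in zip(points, features):
        key = tuple(int(v) for v in np.floor(p / cell_size))
        sums[key] = sums.get(key, 0.0) + f
        counts[key] = counts.get(key, 0) + 1
    return {k: sums[k] / counts[k] for k in sums}


def sparse_conv(cells, weights):
    """Convolve at occupied cells only.

    `weights` maps a neighbour offset (tuple of ints) to a (C_in, C_out)
    matrix. Unoccupied neighbours are skipped via a dictionary lookup, so
    the cost grows with the number of occupied cells, not the grid volume.
    """
    c_out = next(iter(weights.values())).shape[1]
    out = {}
    for key in cells:
        acc = np.zeros(c_out)
        for offset, w in weights.items():
            neighbour = tuple(k + o for k, o in zip(key, offset))
            if neighbour in cells:
                acc += cells[neighbour] @ w
        out[key] = acc
    return out
```

Calling `splat` on an `(N, 3)` array of points and an `(N, C_in)` array of features, then passing the result to `sparse_conv`, returns one output vector per occupied cell and none for empty space.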
|Title of host publication||Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018|
|Publisher||IEEE Computer Society|
|Number of pages||10|
|Publication status||Published - 2018 Dec 14|
|Event||31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 - Salt Lake City, United States|
Duration: 2018 Jun 18 → 2018 Jun 22
|Name||Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition|
|Conference||31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018|
|City||Salt Lake City|
|Period||2018 Jun 18 → 2018 Jun 22|
Bibliographical note
Funding Information:
Maji acknowledges support from NSF (Grant No. 1617917). Kalogerakis acknowledges support from NSF (Grant No. 1422441 and 1617333). Yang acknowledges support from NSF (Grant No. 1149783). We acknowledge the MassTech Collaborative grant for funding the UMass GPU cluster.
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition