Pointwise CNN for 3D Object Classification on Point Cloud

  • Song, Wei (School of Information Science and Technology, North China University of Technology) ;
  • Liu, Zishu (School of Information Science and Technology, North China University of Technology) ;
  • Tian, Yifei (Dept. of Computer and Information Science, University of Macau) ;
  • Fong, Simon (Dept. of Computer and Information Science, University of Macau)
  • Received : 2019.10.11
  • Accepted : 2020.05.29
  • Published : 2021.08.31

Abstract

Three-dimensional (3D) object classification on point clouds is widely used in 3D modeling, face recognition, and robotic missions. However, traditional convolutional networks cannot easily process raw point clouds directly because of their irregular data format. This paper proposes a pointwise convolutional neural network (CNN) structure that processes point cloud data directly, without preprocessing. First, a 2D convolutional layer perceives the coordinate information of each point. Then, multiple 2D convolutional layers and a global max pooling layer extract global features. Finally, fully connected layers predict the class label of each object from the extracted features. We evaluated the proposed pointwise CNN structure on the ModelNet10 dataset, where it obtained higher accuracy than existing methods. The ModelNet10 experiments also show that the number of points in a point cloud does not significantly influence the performance of the proposed structure.
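The pipeline outlined in the abstract (per-point convolution, global max pooling, fully connected classifier) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the first 2D convolution uses a kernel spanning the three coordinates of a single point and that the later convolutions are 1×1, so each layer reduces to one linear map shared by all points. The layer widths (64, 128, 256, 128), the random initialization, and the names `init_params`, `pointwise_layer`, and `forward` are hypothetical. Only the 10 output classes follow from ModelNet10.

```python
import numpy as np


def init_params(rng, point_widths=(3, 64, 128, 256), fc_widths=(256, 128, 10)):
    """Create random weights; all layer widths here are illustrative assumptions."""
    def make(widths):
        return [
            (rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in), np.zeros(n_out))
            for n_in, n_out in zip(widths[:-1], widths[1:])
        ]
    return {"pointwise": make(point_widths), "fc": make(fc_widths)}


def pointwise_layer(x, w, b):
    """Apply the same linear map and ReLU to every point (shared weights).

    Equivalent to a convolution whose kernel covers exactly one point.
    """
    return np.maximum(x @ w + b, 0.0)


def forward(points, params):
    """Map an (N, 3) point cloud to a vector of class logits."""
    x = points
    for w, b in params["pointwise"]:
        x = pointwise_layer(x, w, b)          # shape (N, C)
    g = x.max(axis=0)                         # global max pooling over points -> (C,)
    last = len(params["fc"]) - 1
    for i, (w, b) in enumerate(params["fc"]):
        g = g @ w + b
        if i < last:                          # no activation on the output logits
            g = np.maximum(g, 0.0)
    return g


rng = np.random.default_rng(0)
params = init_params(rng)
cloud = rng.standard_normal((1024, 3))        # synthetic stand-in for a ModelNet10 sample
logits = forward(cloud, params)
print("predicted class index:", int(np.argmax(logits)))
```

Because the max is taken over all points, the output does not depend on the order of the points, and the same weights accept clouds with any number of points. This property makes it plausible that accuracy is not strongly tied to the point count, although the sketch itself does not reproduce the paper's experiments.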

Keywords

Acknowledgement

This research was funded by the MSIT (Ministry of Science and ICT), Korea, under the High-Potential Individuals Global Training Program (No. 2020-0-01576) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation), the National Natural Science Foundation of China (No. 61503005), the Great Wall Scholar Program (No. CIT&TCD20190305), and NCUT funding (No. 110052972027/008).
