Semantic Segmentation using Convolutional Neural Network with Conditional Random Field

  • Submitted : 2017.04.21
  • Reviewed : 2017.06.16
  • Published : 2017.06.30

Abstract

Semantic segmentation, one of the most fundamental and complex problems in computer vision, classifies each pixel of an image as belonging to a specific object and assigns it a label. The probabilistic graphical models MRF and CRF have long been studied as effective ways to improve the accuracy of pixel-level labeling. In this paper, we propose a semantic segmentation method that combines CNN, a branch of deep learning that has recently attracted much attention, with CRF, a probabilistic model. The Pascal VOC 2012 image database was used for training and performance validation, and testing was carried out on arbitrary images not used for training. The results showed better segmentation performance than existing semantic segmentation algorithms.
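The abstract does not say which form of CRF is attached to the CNN, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the CNN has already produced a per-pixel class probability map, and it refines that map with a few mean-field updates of a simple 4-connected Potts model. The function names, the `weight` parameter, and the choice of a grid Potts model (rather than, say, a fully connected CRF) are all illustrative assumptions.

```python
import math


def mean_field_potts(probs, weight=1.0, iterations=5):
    """Refine per-pixel class probabilities with a grid Potts CRF (assumed model).

    probs[y][x] is a list of class probabilities, e.g. a CNN softmax output.
    Returns a map of the same shape holding the refined probabilities.
    """
    height, width = len(probs), len(probs[0])
    num_classes = len(probs[0][0])
    # Initialise the beliefs Q with the CNN output (the unary term).
    q = [[list(p) for p in row] for row in probs]
    for _ in range(iterations):
        new_q = []
        for y in range(height):
            new_row = []
            for x in range(width):
                # Accumulate the neighbours' current beliefs per class.
                support = [0.0] * num_classes
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        for c in range(num_classes):
                            support[c] += q[ny][nx][c]
                # Potts pairwise term: agreeing with neighbours is rewarded.
                scores = [
                    probs[y][x][c] * math.exp(weight * support[c])
                    for c in range(num_classes)
                ]
                total = sum(scores)
                new_row.append([s / total for s in scores])
            new_q.append(new_row)
        q = new_q
    return q


def labels(q):
    """Pick the most probable class index for every pixel."""
    return [[max(range(len(p)), key=p.__getitem__) for p in row] for row in q]


# Hypothetical 3x3 two-class map: every pixel favours class 0 except the centre.
probs = [[[0.8, 0.2] for _ in range(3)] for _ in range(3)]
probs[1][1] = [0.4, 0.6]
refined = mean_field_potts(probs)
print(labels(probs)[1][1], labels(refined)[1][1])
```

In this toy input the isolated centre pixel is relabeled to agree with its neighbours, which is the kind of smoothing a CRF stage is meant to add on top of per-pixel CNN predictions. A practical system would also use image-dependent pairwise terms so that label boundaries follow object edges.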
