Saliency Attention Method for Salient Object Detection Based on Deep Learning


  • 김회준 (Dept. of Plasma-Bio Display, Kwangwoon University) ;
  • 이상훈 (Ingenium College of Liberal Arts, Kwangwoon University) ;
  • 한현호 (College of General Education, University of Ulsan) ;
  • 김진수 (IDP System Co., Ltd.)
  • Received : 2020.10.20
  • Accepted : 2020.12.20
  • Published : 2020.12.28

Abstract

In this paper, we propose a deep learning-based detection method that uses Saliency Attention to detect salient objects in images. Salient object detection separates the objects on which the human gaze naturally focuses from the background, identifying the most relevant regions of an image. It is useful in various fields such as object tracking, detection, and recognition. Most existing deep learning-based methods use an Autoencoder structure, in which substantial feature loss occurs both in the encoder, which compresses and extracts features, and in the decoder, which restores and expands the extracted features. These losses cause parts of the salient object region to be lost, or cause background regions to be detected as objects. The proposed method introduces Saliency Attention to reduce feature loss and suppress background regions within the Autoencoder structure. The influence of each feature value is determined using the ELU activation function, and attention is applied separately to the normalized negative and positive regions of the feature values. Through this attention scheme, background regions are suppressed and salient object regions are emphasized. Experimental results show improved detection performance compared to existing deep learning methods.
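The region-wise attention described above can be sketched as follows. This is a minimal NumPy illustration, assuming min-max normalization of the separated ELU responses and sigmoid attention weights; the function names and the exact recombination are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def elu(x, alpha=1.0):
    # ELU: identity for x > 0, alpha * (exp(x) - 1) for x <= 0
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def minmax_norm(x, eps=1e-8):
    # Scale values into [0, 1]; eps guards against a constant map
    return (x - x.min()) / (x.max() - x.min() + eps)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def saliency_attention(feat):
    """Hypothetical sketch of the described mechanism: the ELU output
    is split into its positive region (object-like responses) and
    negative region (background-like responses), each region is
    normalized, and attention weights are computed per region so that
    positive responses are emphasized and negative ones suppressed."""
    act = elu(feat)
    pos = np.maximum(act, 0.0)        # positive region of the activation
    neg = np.minimum(act, 0.0)        # negative region of the activation
    pos_n = minmax_norm(pos)          # normalize each region separately
    neg_n = minmax_norm(-neg)
    w_pos = sigmoid(pos_n)            # attention weight per region
    w_neg = sigmoid(neg_n)
    # Emphasize salient (positive) responses, damp background (negative) ones
    return feat * w_pos * (1.0 - 0.5 * w_neg)

feature_map = np.array([[-1.0, 0.5],
                        [ 2.0, -0.3]])
attended = saliency_attention(feature_map)
```

In an actual Autoencoder, a map like this would be applied to intermediate feature tensors (e.g. at skip connections) so that the decoder receives features with background responses already attenuated.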


