Deepfake Image Detection based on Visual Saliency

A Deepfake Image Detection Technique Based on Visual Saliency

  • 노하림 (Computer Engineering Major, Duksung Women's University)
  • 유제혁 (Department of Data Science, Duksung Women's University)
  • Received : 2024.01.31
  • Accepted : 2024.02.26
  • Published : 2024.02.28

Abstract

'Deepfake' refers to a video synthesis technique that uses artificial intelligence to create highly realistic fake content. Its use in fake news, fraud, and malicious impersonation causes serious confusion for individuals and society. Addressing this problem requires methods that accurately detect malicious deepfake-generated images. In this paper, we extract and analyze saliency features from deepfake and real images, detect candidate synthesis regions in the images, and construct an automatic deepfake detection model that focuses on the extracted features. The proposed saliency feature-based model can be applied generally in settings that require deepfake detection, such as synthesized images and videos. Several experiments demonstrate the effectiveness of our approach on the deepfake detection task.

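Deepfake is a video synthesis technology that uses various artificial intelligence techniques to create fakes that look real. It is being used for fake news generation, fraud, and malicious impersonation, causing serious confusion for individuals and society. To prevent these social problems, methods are needed that precisely analyze and detect images generated by deepfake. In this paper, we therefore extract and analyze saliency features from deepfake-generated fake images and from real images, detect candidate synthesis regions, and train with a focus on the extracted features to build a deepfake image detection model. The proposed saliency-based deepfake detection model can be used in common across deepfake detection settings such as synthesized images and videos, and various comparative experiments show that the proposed method is effective.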
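
The first two steps described in the abstract, computing a saliency map and deriving a candidate region from it, can be illustrated with a minimal sketch. The abstract does not state which saliency method the authors use, so this example substitutes a simple global-contrast measure (absolute deviation from the mean intensity) as a stand-in. The function names `saliency_map` and `candidate_region`, the threshold factor of 2.0, and the toy image are all assumptions made for illustration, and the detector training step is omitted.

```python
from typing import List, Optional, Tuple

Image = List[List[float]]  # grayscale image as rows of intensities


def saliency_map(gray: Image) -> Image:
    """Global-contrast saliency (stand-in, not the paper's method):
    |pixel - mean intensity|, scaled to [0, 1]."""
    pixels = [v for row in gray for v in row]
    mean = sum(pixels) / len(pixels)
    raw = [[abs(v - mean) for v in row] for row in gray]
    peak = max(max(row) for row in raw)
    if peak == 0:
        # A perfectly flat image has no salient pixels.
        return [[0.0 for _ in row] for row in raw]
    return [[v / peak for v in row] for row in raw]


def candidate_region(
    sal: Image, factor: float = 2.0
) -> Optional[Tuple[int, int, int, int]]:
    """Bounding box (top, left, bottom, right) of pixels whose saliency
    is at least `factor` times the mean saliency; None if no pixel qualifies.
    The factor is an illustrative assumption."""
    values = [v for row in sal for v in row]
    threshold = factor * sum(values) / len(values)
    hits = [
        (r, c)
        for r, row in enumerate(sal)
        for c, v in enumerate(row)
        if v > 0 and v >= threshold
    ]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))


# Toy 6x6 image: dark background with a bright 2x2 patch.
toy = [[10.0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (3, 4):
        toy[r][c] = 200.0

box = candidate_region(saliency_map(toy))
print(box)
```

In a fuller pipeline, a box like this would be cropped from both real and fake images and passed to a classifier, so that training concentrates on the regions most likely to contain synthesis artifacts.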
