Dog-Species Classification through CycleGAN and Standard Data Augmentation

  • Park, Chan (Dept. of Computer Science and Engineering, Hoseo University) ;
  • Moon, Nammee (Dept. of Computer Science and Engineering, Hoseo University)
  • Received : 2022.01.21
  • Accepted : 2022.03.25
  • Published : 2023.02.28

Abstract

In the image field, data augmentation refers to increasing the amount of data through editing methods such as rotating or cropping a photo. In this study, generative adversarial network (GAN) images were created using CycleGAN, and various coat colors of dogs were reflected through data augmentation. Specifically, dog data from the Stanford Dogs Dataset and Oxford-IIIT Pet Dataset were used, and 10 dog breeds with 300 images each were selected. GAN images were then generated using CycleGAN, and four learning groups were established: 2,000 original photos (group I); 2,000 original photos + 1,000 GAN images (group II); 3,000 original photos (group III); and 3,000 original photos + 1,000 GAN images (group IV). The amount of data in each learning group was further augmented using existing data augmentation methods such as rotating, cropping, erasing, and distorting. The augmented photo data were used to train the MobileNet_v3_Large, ResNet-152, InceptionResNet_v2, and NASNet_Large frameworks, and classification accuracy and loss were evaluated. The top-3 accuracy of each deep neural network model was as follows: MobileNet_v3_Large, 86.4% (group I), 85.4% (group II), 90.4% (group III), and 89.2% (group IV); ResNet-152, 82.4% (group I), 83.7% (group II), 84.7% (group III), and 84.9% (group IV); InceptionResNet_v2, 90.7% (group I), 88.4% (group II), 93.3% (group III), and 93.1% (group IV); and NASNet_Large, 85.0% (group I), 88.1% (group II), 91.8% (group III), and 92.0% (group IV). The InceptionResNet_v2 model exhibited the highest image classification accuracy, and the NASNet_Large model exhibited the largest accuracy gain from data augmentation.
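As a rough illustration of two of the standard augmentation operations mentioned in the abstract (cropping and erasing) and of the top-3 accuracy metric used for evaluation, the sketch below uses pure Python with images represented as nested lists. It is a minimal illustration under those assumptions, not the authors' actual implementation; the function names are chosen here for exposition.

```python
import random

def random_crop(img, ch, cw, rng=random):
    """Return a ch x cw window cut from a random position of a 2D image (list of rows)."""
    h, w = len(img), len(img[0])
    top = rng.randrange(h - ch + 1)
    left = rng.randrange(w - cw + 1)
    return [row[left:left + cw] for row in img[top:top + ch]]

def random_erase(img, eh, ew, fill=0, rng=random):
    """Overwrite a random eh x ew patch with a constant fill value (random erasing)."""
    h, w = len(img), len(img[0])
    top = rng.randrange(h - eh + 1)
    left = rng.randrange(w - ew + 1)
    out = [row[:] for row in img]  # copy so the original image is untouched
    for r in range(top, top + eh):
        for c in range(left, left + ew):
            out[r][c] = fill
    return out

def top_k_accuracy(scores, labels, k=3):
    """Fraction of samples whose true label is among the k highest-scored classes."""
    hits = 0
    for s, y in zip(scores, labels):
        topk = sorted(range(len(s)), key=lambda i: s[i], reverse=True)[:k]
        hits += y in topk
    return hits / len(labels)
```

In practice the same operations would be applied through an image library's transform pipeline rather than on raw nested lists, but the logic, cutting or zeroing a randomly placed patch and counting top-k hits, is the same.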

Acknowledgement

This research was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2021R1A2C2011966).

References

  1. A. A. M. Al-Saffar, H. Tao, and M. A. Talab, "Review of deep convolution neural network in image classification," in Proceedings of 2017 International Conference on Radar, Antenna, Microwave, Electronics, and Telecommunications (ICRAMET), Jakarta, Indonesia, 2017, pp. 26-31.
  2. D. Arezki and H. Fizazi, "Alsat-2B/Sentinel-2 imagery classification using the hybrid pigeon inspired optimization algorithm," Journal of Information Processing Systems, vol. 17, no. 4, pp. 690-706, 2021. https://doi.org/10.3745/JIPS.02.0158
  3. T. Akram, H. M. J. Lodhi, S. R. Naqvi, S. Naeem, M. Alhaisoni, M. Ali, S. A. Haider, and N. N. Qadri, "A multilevel features selection framework for skin lesion classification," Human-centric Computing and Information Sciences, vol. 10, article no. 12, 2020. https://doi.org/10.1186/s13673-020-00216-y
  4. H. Liu, "Animal image classification recognition based on transfer learning," International Core Journal of Engineering, vol. 7, no. 8, pp. 135-140, 2021.
  5. A. Vecvanags, K. Aktas, I. Pavlovs, E. Avots, J. Filipovs, A. Brauns, G. Done, D. Jakovels, and G. Anbarjafari, "Ungulate detection and species classification from camera trap images using RetinaNet and Faster R-CNN," Entropy, vol. 24, no. 3, article no. 353, 2022. https://doi.org/10.3390/e24030353
  6. M. Hu and F. You, "Research on animal image classification based on transfer learning," in Proceedings of the 2020 4th International Conference on Electronic Information Technology and Computer Engineering, Xiamen, China, 2020, pp. 756-761.
  7. S. Schneider, S. Greenberg, G. W. Taylor, and S. C. Kremer, "Three critical factors affecting automated image species recognition performance for camera traps," Ecology and Evolution, vol. 10, no. 7, pp. 3503-3517, 2020. https://doi.org/10.1002/ece3.6147
  8. Y. Guo, T. A. Rothfus, A. S. Ashour, L. Si, C. Du, and T. F. Ting, "Varied channels region proposal and classification network for wildlife image classification under complex environment," IET Image Processing, vol. 14, no. 4, pp. 585-591, 2020. https://doi.org/10.1049/iet-ipr.2019.1042
  9. L. Zhao, Y. Zhang, and Y. Cui, "A multi-scale U-shaped attention network-based GAN method for single image dehazing," Human-centric Computing and Information Sciences, vol. 11, article no. 38, 2021. https://doi.org/10.22967/HCIS.2021.11.038
  10. C. O. Ancuti, C. Ancuti, and R. Timofte, "NH-HAZE: an image dehazing benchmark with non-homogeneous hazy and haze-free images," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, 2020, pp. 1798-1805.
  11. A. Antoniou, A. Storkey, and H. Edwards, "Data augmentation generative adversarial networks," 2017 [Online]. Available: https://arxiv.org/abs/1711.04340.
  12. L. Perez and J. Wang, "The effectiveness of data augmentation in image classification using deep learning," 2017 [Online]. Available: https://arxiv.org/abs/1712.04621.
  13. J. Cho and N. Moon, "Design of image generation system for DCGAN-based kids' book text," Journal of Information Processing Systems, vol. 16, no. 6, pp. 1437-1446, 2020.
  14. A. S. B. Reddy and D. S. Juliet, "Transfer learning with ResNet-50 for malaria cell-image classification," in Proceedings of 2019 International Conference on Communication and Signal Processing (ICCSP), Chennai, India, 2019, pp. 945-949.
  15. S. Phiphiphatphaisit and O. Surinta, "Food image classification with improved MobileNet architecture and data augmentation," in Proceedings of the 3rd International Conference on Information Science and Systems, Cambridge, UK, 2020, pp. 51-56.
  16. A. Adedoja, P. A. Owolawi, and T. Mapayi, "Deep learning based on NASNet for plant disease recognition using leave images," in Proceedings of 2019 International Conference on Advances in Big Data, Computing and Data Communication Systems (icABCD), Winterton, South Africa, 2019, pp. 1-5.
  17. A. Howard, M. Sandler, G. Chu, L. C. Chen, B. Chen, M. Tan, et al., "Searching for MobileNetV3," in Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, South Korea, 2019, pp. 1314-1324.
  18. K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, 2016, pp. 770-778.
  19. C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi, "Inception-v4, Inception-ResNet and the impact of residual connections on learning," in Proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, CA, 2017, pp. 4278-4284.
  20. B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, "Learning transferable architectures for scalable image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, 2018, pp. 8697-8710.
  21. W. Wang, Y. Yang, X. Wang, W. Wang, and J. Li, "Development of convolutional neural network and its application in image classification: a survey," Optical Engineering, vol. 58, no. 4, article no. 040901, 2019. https://doi.org/10.1117/1.OE.58.4.040901
  22. S. Bianco, R. Cadene, L. Celona, and P. Napoletano, "Benchmark analysis of representative deep neural network architectures," IEEE Access, vol. 6, pp. 64270-64277, 2018. https://doi.org/10.1109/access.2018.2877890
  23. N. E. Khalifa, M. Loey, and S. Mirjalili, "A comprehensive survey of recent trends in deep learning for digital images augmentation," Artificial Intelligence Review, vol. 55, pp. 2351-2377, 2022. https://doi.org/10.1007/s10462-021-10066-4
  24. C. Shorten and T. M. Khoshgoftaar, "A survey on image data augmentation for deep learning," Journal of Big Data, vol. 6, article no. 60, 2019. https://doi.org/10.1186/s40537-019-0197-0
  25. R. Takahashi, T. Matsubara, and K. Uehara, "Data augmentation using random image cropping and patching for deep CNNs," IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 9, pp. 2917-2931, 2019. https://doi.org/10.1109/tcsvt.2019.2935128
  26. Z. Zhong, L. Zheng, G. Kang, S. Li, and Y. Yang, "Random erasing data augmentation," in Proceedings of the 34th AAAI Conference on Artificial Intelligence, New York, NY, 2020, pp. 13001-13008.
  27. A. Jabbar, X. Li, and B. Omar, "A survey on generative adversarial networks: variants, applications, and training," ACM Computing Surveys, vol. 54, no. 8, article no. 157, 2022. https://doi.org/10.1145/3463475
  28. C. Han, L. Rundo, R. Araki, Y. Nagano, Y. Furukawa, G. Mauri, H. Nakayama, and H. Hayashi, "Combining noise-to-image and image-to-image GANs: Brain MR image augmentation for tumor detection," IEEE Access, vol. 7, pp. 156966-156977, 2019. https://doi.org/10.1109/ACCESS.2019.2947606
  29. D. H. Lee, Y. Li, and B. S. Shin, "Generalization of intensity distribution of medical images using GANs," Human-centric Computing and Information Sciences, vol. 10, article no. 17, 2020. https://doi.org/10.1186/s13673-020-00220-2
  30. P. Isola, J. Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, 2017, pp. 5967-5976.
  31. J. Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 2017, pp. 2242-2251.
  32. W. Li, L. Fan, Z. Wang, C. Ma, and X. Cui, "Tackling mode collapse in multi-generator GANs with orthogonal vectors," Pattern Recognition, vol. 110, article no. 107646, 2021. https://doi.org/10.1016/j.patcog.2020.107646
  33. H. De Meulemeester, J. Schreurs, M. Fanuel, B. De Moor, and J. A. Suykens, "The Bures metric for generative adversarial networks," in Machine Learning and Knowledge Discovery in Databases: Research Track. Cham, Switzerland: Springer, 2021, pp. 52-66.
  34. M. D. Bloice, C. Stocker, and A. Holzinger, "Augmentor: an image augmentation library for machine learning," 2017 [Online]. Available: https://arxiv.org/abs/1708.04680.