Automatic Generation of Fashion Image Dataset by Using Progressive Growing GAN

  • Kim, Yanghee (Department of Computer Science Engineering, College of Informatics, Korea University)
  • Lee, Chanhee (Department of Computer Science Engineering, College of Informatics, Korea University)
  • Whang, Taesun (Department of Computer Science Engineering, College of Informatics, Korea University)
  • Kim, Gyeongmin (Department of Computer Science Engineering, College of Informatics, Korea University)
  • Lim, Heuiseok (Department of Computer Science Engineering, College of Informatics, Korea University)
  • Received: 2018.06.24
  • Reviewed: 2018.09.16
  • Published: 2018.12.31

Abstract

Techniques for generating new samples from high-dimensional data such as images are widely used in speech synthesis, image conversion, and image restoration. To generate high-resolution images and to increase the variation among the generated images, this paper adopts Progressive Growing of Generative Adversarial Networks (PG-GANs) as the implementation model and applies it to fashion image data. PG-GANs train the generator and the discriminator progressively and simultaneously: training starts from low-resolution images, and new layers are added step by step until the network generates high-resolution images. In addition, a minibatch standard deviation method is proposed to increase the diversity of the generated data, and the Sliced Wasserstein Distance (SWD) is proposed as the evaluation method for the GAN model in place of the existing MS-SSIM.
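The two techniques named at the end of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it makes simplifying assumptions: samples are flat feature vectors rather than convolutional feature maps, SWD is computed directly on those vectors rather than on image patches, and all function names (`minibatch_stddev`, `append_stddev_feature`, `sliced_wasserstein`) are hypothetical. Only the Python standard library is used.

```python
import math
import random
import statistics


def minibatch_stddev(batch):
    """Return one scalar summarising variation across a minibatch.

    `batch` is a list of equally long feature vectors (lists of floats).
    The standard deviation of each feature is taken over the batch,
    then the per-feature values are averaged into a single number.
    """
    per_feature = [statistics.pstdev(column) for column in zip(*batch)]
    return sum(per_feature) / len(per_feature)


def append_stddev_feature(batch):
    """Append the shared statistic to every sample as an extra feature."""
    s = minibatch_stddev(batch)
    return [list(sample) + [s] for sample in batch]


def sliced_wasserstein(xs, ys, n_projections=64, seed=0):
    """Approximate the sliced Wasserstein-1 distance between two point sets.

    Both sets must contain the same number of points. Each random unit
    direction reduces the problem to 1-D, where optimal transport is
    solved by sorting the projections and pairing them in order.
    """
    if len(xs) != len(ys):
        raise ValueError("point sets must have equal size")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        # Draw a random direction and normalise it to unit length.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(d * d for d in direction))
        direction = [d / norm for d in direction]
        # Project both sets onto the direction and sort the results.
        px = sorted(sum(a * d for a, d in zip(x, direction)) for x in xs)
        py = sorted(sum(a * d for a, d in zip(y, direction)) for y in ys)
        total += sum(abs(a - b) for a, b in zip(px, py)) / len(px)
    return total / n_projections
```

A batch of identical samples yields a statistic of zero, so a discriminator that receives this extra feature can detect a generator that has collapsed to a single output. The sliced distance is zero for identical sets and grows as the two sets diverge, which is what makes it usable as a similarity score between generated and real data.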
