Recognition of Multi Label Fashion Styles based on Transfer Learning and Graph Convolution Network

Multi-Label Fashion Style Recognition Based on Transfer Learning and Graph Convolutional Networks

  • Kim, Sunghoon (Department of Data Science, Seoul Women's University)
  • Choi, Yerim (Department of Data Science, Seoul Women's University)
  • Park, Jonghyuk (Department of Industrial Engineering, Seoul National University)
  • Received : 2020.10.07
  • Accepted : 2021.01.28
  • Published : 2021.02.28


Attempts to apply deep learning in the fashion industry have been increasing. Accordingly, research addressing various fashion-related problems has been proposed and has achieved superior performance. However, studies on fashion style classification have not reflected a key characteristic of fashion style: a single outfit can exhibit multiple styles simultaneously. We therefore treat fashion style recognition as a multi-label classification problem and exploit the dependencies between styles to solve it. A multi-label recognition model based on a graph convolution network (GCN) is applied to capture and explore these dependencies. Furthermore, transfer learning is used to accelerate training and improve the model's performance. The proposed model was validated on a dataset collected from social network services and outperformed the baseline models.

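Recently, the fashion industry has seen growing attempts to apply rapidly advancing deep learning methods. Accordingly, studies addressing a variety of fashion-related problems have been proposed and have achieved strong performance. In fashion style classification, however, existing studies have not reflected the characteristic that a single outfit can include multiple styles at the same time. This study therefore models the dependencies among co-occurring labels and uses them to solve the multi-label classification problem for fashion styles. To capture and explore the dependencies among fashion styles, a multi-label recognition model based on a GCN (graph convolution network) was applied. In addition, transfer learning improved the model's training speed and performance. The proposed model was validated using SNS image data collected through web crawling and outperformed the comparison models.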
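
The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the general idea it describes: a graph built from style co-occurrence, a two-layer graph convolution that turns per-style embeddings into per-style classifiers, and image features that would come from a pretrained backbone (the transfer-learning step). All names here (`label_adjacency`, `gcn_classifiers`, `predict_scores`), the co-occurrence threshold of 0.4, the layer sizes, and the random stand-ins for label embeddings and image features are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def label_adjacency(Y, threshold=0.4):
    """Build a normalized label graph from a binary label matrix Y
    of shape (n_samples, n_labels). The threshold is an assumed value."""
    Y = np.asarray(Y, dtype=float)
    co = Y.T @ Y                                   # co[i, j]: samples carrying both labels i and j
    counts = np.diag(co).copy()                    # counts[i]: samples carrying label i
    cond = co / np.maximum(counts[:, None], 1.0)   # cond[i, j] estimates P(label j | label i)
    A = (cond >= threshold).astype(float)          # keep only strong dependencies
    np.fill_diagonal(A, 1.0)                       # self-loops, so every degree is at least 1
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def gcn_classifiers(A_hat, label_emb, W1, W2):
    """Two graph-convolution layers mapping label embeddings to one
    classifier vector per label (ReLU is an assumed activation)."""
    H = np.maximum(A_hat @ label_emb @ W1, 0.0)
    return A_hat @ H @ W2                          # shape (n_labels, feat_dim)

def predict_scores(image_feats, classifiers):
    """Per-label sigmoid scores, so several styles can be active at once."""
    logits = image_feats @ classifiers.T
    return 1.0 / (1.0 + np.exp(-logits))

# Toy data: 4 outfits, 3 hypothetical styles.
Y = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
A_hat = label_adjacency(Y)

rng = np.random.default_rng(0)
emb_dim, hidden_dim, feat_dim = 5, 8, 16
label_emb = rng.normal(size=(3, emb_dim))                 # stand-in for word embeddings of style names
W1 = rng.normal(scale=0.1, size=(emb_dim, hidden_dim))    # untrained weights, for shape checking only
W2 = rng.normal(scale=0.1, size=(hidden_dim, feat_dim))
image_feats = rng.normal(size=(4, feat_dim))              # stand-in for pretrained CNN features

scores = predict_scores(image_feats, gcn_classifiers(A_hat, label_emb, W1, W2))
print(scores.shape)
```

In a real pipeline the weights would be trained with a per-label binary cross-entropy loss, and `image_feats` would be produced by a backbone initialized from pretrained weights rather than drawn at random.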


