A Study on the System for AI Service Production

Hong, Yong-Geun

doi:10.3745/KTCCS.2022.11.10.323

정보처리학회논문지:컴퓨터 및 통신 시스템 (KIPS Transactions on Computer and Communication Systems)

Vol. 11, No. 10 / pp. 323-332 / 2022 / 2287-5891 (pISSN) / 2734-049X (eISSN)

한국정보처리학회 (Korea Information Processing Society)

Hong, Yong-Geun (Department of AI Convergence, Daejeon University)

Received: 2022.04.15
Reviewed: 2022.05.24
Published: 2022.10.31

Abstract

As more services built on AI technology are developed, AI service production, meaning the deployment and operation of AI services, is drawing growing attention. AI technology has recently come to be regarded as one kind of ICT service, and much research now targets general-purpose AI service production. This paper focuses on the final stage of the typical machine learning development process, the deployment and operation of machine learning models, and reports results from a system perspective. Three different Ubuntu systems were built. On these systems, an object detection service was run through Tensorflow serving on images from the 2017 validation COCO dataset, using combinations of two AI models (RFCN, SSD-Mobilenet) and two communication methods (gRPC, REST). The experiments were divided into the part that requests the AI service and the part that performs it. The results show that the type of AI model affects AI service inference time more than the communication method does, and that for an object detection AI service the inference time depends more on the number and complexity of objects in an image than on the file size of the image. They also confirm that performing the AI service remotely rather than locally takes longer to infer, even when the remote machine has good performance. These results are expected to support system design suited to service goals, AI model development, and efficient AI service production.
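The kind of measurement described in the abstract can be sketched as follows. This is a minimal illustration, not the author's test harness: it assumes a Tensorflow serving instance exposing its REST predict endpoint on the default REST port 8501, and the helper names (`predict_url`, `rest_predict`, `time_calls`, `compare`) and the model name used in the usage note are hypothetical. A gRPC sender would be passed to `compare` in the same way as any other callable.

```python
import json
import time
import urllib.request
from statistics import mean
from typing import Callable, Dict


def predict_url(host: str, model: str, port: int = 8501) -> str:
    """Build the REST predict URL of a Tensorflow serving instance.

    8501 is the default REST port; host and model name are supplied by the caller.
    """
    return f"http://{host}:{port}/v1/models/{model}:predict"


def rest_predict(url: str, instances: list) -> dict:
    """POST one predict request in the 'instances' row format and decode the reply."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def time_calls(send: Callable[[], object], repeats: int = 10, warmup: int = 1) -> float:
    """Return the mean wall-clock seconds per call, discarding warm-up calls."""
    for _ in range(warmup):
        send()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        send()
        samples.append(time.perf_counter() - start)
    return mean(samples)


def compare(senders: Dict[str, Callable[[], object]], repeats: int = 10) -> Dict[str, float]:
    """Time each named sender (for example one per model and protocol combination)."""
    return {name: time_calls(send, repeats) for name, send in senders.items()}
```

A call such as `compare({"ssd-rest": lambda: rest_predict(predict_url("localhost", "ssd_mobilenet"), [image])})` would time one model and protocol combination, where `ssd_mobilenet` and `image` (a nested list of pixel values) are placeholders. Timing on the requesting side includes network transfer, which is why local and remote runs can differ even on identical models.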

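Funding

This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP), funded by the Korea government (Ministry of Science and ICT) in 2021 (No. 2021-0-00188, development of open source and international standards for interworking heterogeneous IoT platforms based on an AI function support framework).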
