A Study on a Development Environment for Machine Learning Automation

A Study on Development Environments for Machine Learning

  • Received: 2020.07.13
  • Reviewed: 2020.08.06
  • Published: 2020.12.31

Abstract

The performance of a machine learning model is strongly affected by its data. Preprocessing is needed so that data of various types, such as letters, numbers, and special characters, can be analyzed. This paper proposes a development environment that, in stage 1, processes categorical and continuous data according to the type of missing values; in stage 2, selects the best-performing algorithm; and in stage 3, automates the process of checking model performance. Using this environment, machine learning models can be created without prior knowledge of data preprocessing.
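The three stages can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the imputation rules, candidate algorithms, or metrics, so mean/mode imputation, one-hot encoding, the three toy classifiers, the holdout split, and every function name and data value below are assumptions chosen only to keep the example self-contained in standard-library Python.

```python
from collections import Counter
from statistics import mean

def preprocess(rows):
    """Stage 1 (assumed rules): fill missing values (None) with the column mean
    for continuous columns or the mode for categorical ones, then one-hot encode."""
    encoded_cols = []
    for j in range(len(rows[0])):
        present = [r[j] for r in rows if r[j] is not None]
        if all(isinstance(v, (int, float)) for v in present):
            fill = mean(present)  # continuous column: mean imputation
            encoded_cols.append([[float(fill if r[j] is None else r[j])] for r in rows])
        else:
            fill = Counter(present).most_common(1)[0][0]  # categorical: mode imputation
            levels = sorted(set(present))
            encoded_cols.append(
                [[1.0 if (fill if r[j] is None else r[j]) == lv else 0.0 for lv in levels]
                 for r in rows])
    # Concatenate the per-column encodings into one feature vector per row.
    return [sum((col[i] for col in encoded_cols), []) for i in range(len(rows))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical candidate algorithms; each fit_* returns a predict(x) function.
def fit_majority(X, y):
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label

def fit_1nn(X, y):
    return lambda x: min(zip(X, y), key=lambda p: sq_dist(p[0], x))[1]

def fit_centroid(X, y):
    centroids = {}
    for label in sorted(set(y)):
        pts = [xi for xi, yi in zip(X, y) if yi == label]
        centroids[label] = [mean(col) for col in zip(*pts)]
    return lambda x: min(centroids, key=lambda lb: sq_dist(centroids[lb], x))

def evaluate(y_true, y_pred, positive=1):
    """Stage 3: accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def auto_ml(rows, labels, n_test):
    """Run the three stages; the last n_test rows are held out for scoring."""
    X = preprocess(rows)
    X_train, X_test = X[:-n_test], X[-n_test:]
    y_train, y_test = labels[:-n_test], labels[-n_test:]
    candidates = {"majority": fit_majority, "1-nn": fit_1nn, "centroid": fit_centroid}
    reports = {}
    for name, fit in candidates.items():
        predict = fit(X_train, y_train)
        reports[name] = evaluate(y_test, [predict(x) for x in X_test])
    # Stage 2: keep the candidate with the highest holdout accuracy.
    best = max(reports, key=lambda name: reports[name]["accuracy"])
    return best, reports

# Made-up toy data: (continuous value, categorical value), None marks a missing entry.
rows = [(22.0, "male"), (38.0, "female"), (None, "female"), (35.0, None),
        (54.0, "male"), (27.0, "female"), (4.0, "male"), (30.0, "female")]
labels = [0, 1, 1, 1, 0, 1, 0, 1]

best, reports = auto_ml(rows, labels, n_test=2)
print(best, reports[best])
```

For brevity the sketch selects and scores the model on the same holdout rows and imputes before splitting, both of which give optimistic estimates; a fuller environment would presumably use a separate validation split or cross-validation and fit the imputation on training rows only.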
