A Reconstruction of Classification for Iris Species Using Euclidean Distance Based on a Machine Learning

머신러닝 기반 유클리드 거리를 이용한 붓꽃 품종 분류 재구성

  • Nam, Soo-Tai (Institute of General Education, Pusan National University)
  • Shin, Seong-Yoon (School of Computer Information & Communication Engineering, Kunsan National University)
  • Jin, Chan-Yong (Division of Information & Electronic Commerce, Wonkwang University)
  • Received : 2019.09.30
  • Accepted : 2019.11.02
  • Published : 2020.02.29

Abstract

Machine learning is a family of algorithms that train a computer on data so that it can identify trends in the data and predict the output for new input data. Machine learning can be divided into supervised learning, unsupervised learning, and reinforcement learning. Supervised learning trains a machine on data for which labels are given. In other words, a function of the system is inferred from pairs of data and labels, and the inferred function is then used to predict the result for new input data. If the predicted value is continuous, the task is regression; if it is discrete, the task is classification. In this study, the Euclidean distance between a new iris observation (sepal length 5.01, sepal width 3.43) and the base data was analyzed. As a result, rows no. 8 (5, 3.4, setosa), 27 (5, 3.4, setosa), 41 (5, 3.5, setosa), 44 (5, 3.5, setosa), and 40 (5.1, 3.4, setosa) in Table 3 were classified, in that order, as the most similar iris flowers. Theoretical and practical implications are presented accordingly.

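Machine learning is an algorithm that trains a computer on data so that the computer itself grasps the tendencies of the data and predicts the output for new input data. Machine learning is broadly divided into supervised learning, unsupervised learning, and reinforcement learning. Supervised learning is a method of training a machine when labels for the data are given. That is, the function of the system is inferred from pairs of data and labels, and the inferred function is used to predict the result for new input data. When the predicted value is continuous, it is used for regression analysis; when the predicted value is discrete, it is used for classification. Using the new iris data, Sepal length (5.01) and Sepal width (3.43), the Euclidean distance to the base data was analyzed. As a result, the data in Table 3 were classified as the most similar iris flowers in the order no. 8 (5, 3.4, setosa), no. 27 (5, 3.4, setosa), no. 41 (5, 3.5, setosa), no. 44 (5, 3.5, setosa), and no. 40 (5.1, 3.4, setosa). Theoretical and practical implications were presented accordingly.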
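
The ranking reported in the abstract can be checked with a short sketch. It is an illustration only, not the authors' code. It assumes the new observation is the point (sepal length 5.01, sepal width 3.43) given in the abstract, and it uses only the five Table 3 rows quoted there, since the rest of the table is not reproduced on this page. The names `query`, `rows`, and `euclidean` are hypothetical.

```python
import math

# New iris observation from the abstract: (sepal length, sepal width).
query = (5.01, 3.43)

# The five Table 3 rows quoted in the abstract, keyed by row number.
# The full table is not available here, so this is only a partial check.
rows = {
    8: (5.0, 3.4),
    27: (5.0, 3.4),
    41: (5.0, 3.5),
    44: (5.0, 3.5),
    40: (5.1, 3.4),
}


def euclidean(p, q):
    """Two-dimensional Euclidean distance: sqrt((p0 - q0)^2 + (p1 - q1)^2)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Rank rows from most to least similar; ties are broken by row number.
ranked = sorted(rows, key=lambda n: (euclidean(query, rows[n]), n))

for n in ranked:
    print(f"no. {n}: distance = {euclidean(query, rows[n]):.3f}")
```

For these five rows the distances come out to roughly 0.032 (no. 8 and 27), 0.071 (no. 41 and 44), and 0.095 (no. 40), which matches the order stated in the abstract. All five rows are labeled setosa, so a nearest-neighbor rule over them would assign the new observation to setosa.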
