• Title/Summary/Keyword: tree based learning

Exploring the Factors Influencing Students' Career Maturity in Seoul City Middle School: A Machine Learning (머신러닝을 활용한 서울시 중학생 진로성숙도 예측 요인 탐색)

  • Park, Jung
    • The Journal of Bigdata / v.5 no.2 / pp.155-170 / 2020
  • This study applied machine learning techniques (Decision Tree, Random Forest, XGBoost) to data from the 4th to 6th years of the Seoul Education Longitudinal Study to identify the factors that predict the career maturity of middle school students in Seoul. Model performance was evaluated against a set of performance indicators. The models were also analyzed with the XGBoostExplainer package; R and RStudio were used for the study. The ranking of variable importance differed slightly between models, but 'Achievement goal awareness', 'Creativity', 'Self-concept', 'Parent-child relationship', and 'Resilience' ranked high. The XGBoostExplainer analysis further identified, for each panel, the factors that protect or deteriorate career maturity, and showed that 'Achievement goal awareness' is the top-priority factor for predicting career maturity. Based on these results, the study suggests follow-up comparative studies of machine learning and variable selection methods, and of the individual cohorts of the Seoul Education Longitudinal Study.

Meltdown Threat Dynamic Detection Mechanism using Decision-Tree based Machine Learning Method (의사결정트리 기반 머신러닝 기법을 적용한 멜트다운 취약점 동적 탐지 메커니즘)

  • Lee, Jae-Kyu;Lee, Hyung-Woo
    • Journal of Convergence for Information Technology / v.8 no.6 / pp.209-215 / 2018
  • In this paper, we propose a method that uses a dynamic sandbox tool to detect and block Meltdown malicious code, which is increasing rapidly. Although patches are available for the Meltdown vulnerability, some are intentionally left unapplied because of the performance degradation they cause. We therefore propose a machine learning method that overcomes the limitations of existing signature-based detection for infrastructures without active patches. First, to understand the principle of Meltdown, we analyze the underlying operating system and processor mechanisms: virtual memory, memory privilege checks, pipelining and speculative execution, and the CPU cache. We then extract data with the Linux strace tool for detecting Meltdown malware. Finally, we implement a decision tree based dynamic detection mechanism to identify Meltdown malicious code efficiently.
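The core step of a decision tree detector like the one described above is choosing a split that separates benign from malicious traces. The following is a minimal sketch of that single step, using Gini impurity on per-trace system call counts. The feature columns, the toy counts, and the names `gini` and `best_split` are assumptions made for illustration; the abstract does not say which strace features or which tree algorithm the authors used.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 = benign, 1 = malicious)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of malicious samples
    return 2 * p * (1 - p)

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best single split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

# Hypothetical per-trace counts of two system calls (made-up numbers):
# column 0 = mmap calls, column 1 = rt_sigaction calls.
rows = [[3, 0], [5, 1], [4, 0], [6, 2], [4, 40], [5, 55], [3, 61]]
labels = [0, 0, 0, 0, 1, 1, 1]

split = best_split(rows, labels)
print(split)
```

On this toy data the second column separates the two classes cleanly, so the chosen split has zero impurity. A full decision tree would apply the same search recursively to each side of the split.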

Development of a Medical Care Cost Prediction Model for Cancer Patients Using Case-Based Reasoning (사례기반 추론을 이용한 암 환자 진료비 예측 모형의 개발)

  • Chung, Suk-Hoon;Suh, Yong-Moo
    • Asia pacific journal of information systems / v.16 no.2 / pp.69-84 / 2006
  • With the spread of integrated hospital information systems, large amounts of varied data are accumulating in hospital databases, and many researchers have studied how to use such data. Most of this research has focused on medical diagnosis, however, and few studies have developed medical care cost prediction models, especially with machine learning techniques. In this research, we therefore built a medical care cost prediction model for cancer patients using CBR (Case-Based Reasoning), one of the machine learning techniques, and compared its performance with that of Neural Network and Decision Tree models. In the experiment, the CBR prediction model was generally the best with respect to error rate and the linearity between real and predicted values. We believe the medical care cost prediction model can be used for the effective management of limited resources in hospitals.

A Novel Red Apple Detection Algorithm Based on AdaBoost Learning

  • Kim, Donggi;Choi, Hongchul;Choi, Jaehoon;Yoo, Seong Joon;Han, Dongil
    • IEIE Transactions on Smart Processing and Computing / v.4 no.4 / pp.265-271 / 2015
  • This study proposes an algorithm that recognizes apple trees in images and detects apples in order to count the apples on the trees. The proposed algorithm first determines whether apple trees are present based on the number of edges per image block, and then detects apple areas. The CIE $L^*a^*b^*$ color space is used to extract colors appropriate for apple areas, and the modified census transform (MCT) is used to extract apple features that are robust to illumination changes. The AdaBoost learning algorithm is then trained on the apple feature data, and the trained data are used to detect apple areas. The proposed algorithm has a higher detection rate than existing pixel-based image processing algorithms and minimizes false detection.

An Improved Text Classification Method for Sentiment Classification

  • Wang, Guangxing;Shin, Seong Yoon
    • Journal of information and communication convergence engineering / v.17 no.1 / pp.41-48 / 2019
  • In recent years, sentiment analysis research has become popular, and its results have been applied with remarkable success in practice, for example in Amazon's book recommendation system and the North American movie box office evaluation system. Analyzing big data on user preferences and evaluations, and recommending best-selling books and highly rated movies to users in a targeted manner, greatly improves book sales and movie attendance [1, 2]. However, traditional machine learning-based sentiment analysis methods such as the Classification and Regression Tree (CART), Support Vector Machine (SVM), and k-nearest neighbor classification (kNN) have performed poorly in accuracy. In this paper, an improved kNN classification method is proposed, which improves accuracy through the improved method and data normalization. The three classification algorithms and the improved algorithm were then compared on experimental data. Experiments show that the improved method performs best in the kNN classification method, with an accuracy rate of 11.5% and a precision rate of 20.3%.
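To illustrate why data normalization matters for kNN, the sketch below runs a plain nearest-neighbor vote with and without min-max scaling. It is not the authors' improved method, which the abstract does not specify; the toy data and the function names (`min_max_fit`, `min_max_apply`, `knn_predict`) are assumptions made for illustration.

```python
from collections import Counter
import math

def min_max_fit(rows):
    """Return per-column (min, max) computed from the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def min_max_apply(row, bounds):
    """Scale one row to [0, 1] per column; constant columns map to 0.0."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, bounds)]

def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

# Made-up data: feature 0 lies in [0, 1], feature 1 is in the thousands.
X = [[0.1, 1000], [0.2, 5000], [0.9, 1200], [0.8, 5200]]
y = ["neg", "neg", "pos", "pos"]
query = [0.85, 4900]

bounds = min_max_fit(X)
X_scaled = [min_max_apply(row, bounds) for row in X]
q_scaled = min_max_apply(query, bounds)

raw_pred = knn_predict(X, y, query, k=1)                # dominated by feature 1
scaled_pred = knn_predict(X_scaled, y, q_scaled, k=1)   # both features count
print(raw_pred, scaled_pred)
```

Without scaling, the large-valued feature dominates the Euclidean distance and the query is matched to a "neg" neighbor; after scaling, both features contribute and the nearest neighbor is "pos".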

Method of Analyzing Important Variables using Machine Learning-based Golf Putting Direction Prediction Model (머신러닝 기반 골프 퍼팅 방향 예측 모델을 활용한 중요 변수 분석 방법론)

  • Kim, Yeon Ho;Cho, Seung Hyun;Jung, Hae Ryun;Lee, Ki Kwang
    • Korean Journal of Applied Biomechanics / v.32 no.1 / pp.1-8 / 2022
  • Objective: This study proposes a methodology to analyze important variables that have a significant impact on the putting direction prediction using a machine learning-based putting direction prediction model trained with IMU sensor data. Method: Putting data were collected using an IMU sensor measuring 12 variables from 6 adult males in their 20s at K University who had no golf experience. The data was preprocessed so that it could be applied to machine learning, and a model was built using five machine learning algorithms. Finally, by comparing the performance of the built models, the model with the highest performance was selected as the proposed model, and then 12 variables of the IMU sensor were applied one by one to analyze important variables affecting the learning performance. Results: As a result of comparing the performance of five machine learning algorithms (K-NN, Naive Bayes, Decision Tree, Random Forest, and Light GBM), the prediction accuracy of the Light GBM-based prediction model was higher than that of other algorithms. Using the Light GBM algorithm, which had excellent performance, an experiment was performed to rank the importance of variables that affect the direction prediction of the model. Conclusion: Among the five machine learning algorithms, the algorithm that best predicts the putting direction was the Light GBM algorithm. When the model predicted the putting direction, the variable that had the greatest influence was the left-right inclination (Roll).

Comparing automated and non-automated machine learning for autism spectrum disorders classification using facial images

  • Elshoky, Basma Ramdan Gamal;Younis, Eman M.G.;Ali, Abdelmgeid Amin;Ibrahim, Osman Ali Sadek
    • ETRI Journal / v.44 no.4 / pp.613-623 / 2022
  • Autism spectrum disorder (ASD) is a developmental disorder associated with cognitive and neurobehavioral disorders. It affects a person's behavior and performance, as well as verbal and non-verbal communication in social interactions. Early screening and diagnosis of ASD are essential for early educational planning and treatment, for family support, and for providing the child with appropriate medical support in time. Developing automated methods for diagnosing ASD is therefore becoming an essential need. Herein, we investigate various machine learning methods for building predictive models that diagnose ASD in children from facial images. To this end, we used an autistic children dataset containing 2936 facial images of children with autism and typical children. We applied classical machine learning methods, such as support vector machine and random forest, and deep-learning methods, as well as a state-of-the-art method, automated machine learning (AutoML), and compared the results with those of existing techniques. We found that AutoML achieved the highest performance, approximately 96% accuracy, via Hyperopt and Tree-based Pipeline Optimization Tool (TPOT) optimization. Furthermore, AutoML methods enabled us to find the best parameter settings easily, without any human effort for feature engineering.

Comparative Analysis of Intrusion Detection Attack Based on Machine Learning Classifiers

  • Surafel Mehari;Anuja Kumar Acharya
    • International Journal of Computer Science & Network Security / v.24 no.10 / pp.115-124 / 2024
  • Today, information is transmitted from one place to another over network communication technology, and such transmission requires a highly secure networking environment. The main strategy for securing this environment is to identify each packet correctly and to detect whether it carries malicious content or whether any illegal activity has occurred in the network. An intrusion detection system (IDS) is used for this purpose: a security technology designed to detect intrusions and automatically alert or notify a responsible person. However, building an efficient IDS faces a number of challenges, namely false detections and data with a large number of features. Many researchers currently use machine learning techniques to overcome these limitations and to increase the efficiency of intrusion detection in classifying packets as normal or malicious. Many machine learning techniques are used in intrusion detection, so the question is which classifiers have the potential to address the intrusion detection problem in a network security environment. Choosing appropriate machine learning techniques is required to improve the accuracy of an IDS. In this work, three machine learning classifiers are analyzed: Support Vector Machine, Naïve Bayes, and K-Nearest Neighbor. The algorithms were tested on the NSL-KDD dataset using a combination of Chi-square and Extra Tree feature selection, and Python was used to implement, analyze, and evaluate the classifiers. The experimental results show that the K-Nearest Neighbor classifier outperforms the other methods in categorizing packets as normal or malicious.
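As a rough illustration of the Chi-square half of the feature selection mentioned above, the sketch below scores categorical features against the class label with the classic Pearson chi-square statistic on a contingency table and keeps the top-scoring ones. The feature names, the toy values, and the helper names are made up for illustration; the Extra Tree step and the authors' actual pipeline are not shown, and library implementations of chi-square feature scoring may define the statistic differently.

```python
from collections import Counter

def chi_square_score(feature, labels):
    """Pearson chi-square statistic for one categorical feature vs. class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    score = 0.0
    for f, f_count in feature_totals.items():
        for l, l_count in label_totals.items():
            expected = f_count * l_count / n  # count expected under independence
            score += (observed[(f, l)] - expected) ** 2 / expected
    return score

def select_top_k(features, labels, k):
    """Return the k feature names with the highest scores, plus all scores."""
    scores = {name: chi_square_score(col, labels) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Made-up binary features for eight connections (hypothetical names and values).
labels = ["normal"] * 4 + ["attack"] * 4
features = {
    "flag_error": [0, 0, 0, 0, 1, 1, 1, 1],  # perfectly associated with the label
    "proto_tcp":  [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
    "logged_in":  [1, 1, 1, 0, 0, 0, 0, 1],  # weakly associated
}

selected, scores = select_top_k(features, labels, k=2)
print(selected, scores)
```

A higher score means the feature's distribution departs more from what independence from the label would predict, so the independent feature scores zero and is dropped.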

Design of a Mirror for Fragrance Recommendation based on Personal Emotion Analysis (개인의 감성 분석 기반 향 추천 미러 설계)

  • Hyeonji Kim;Yoosoo Oh
    • Journal of Korea Society of Industrial Information Systems / v.28 no.4 / pp.11-19 / 2023
  • This paper proposes a smart mirror system that recommends fragrances based on an analysis of the user's emotions. It combines natural language processing embedding techniques (CountVectorizer and TF-IDF) with machine learning classification models (DecisionTree, SVM, RandomForest, SGD Classifier) to build emotion classifiers and compares their results. Based on the comparison, the paper builds a personal emotion-based fragrance recommendation mirror model on the best-performing emotion classifier, a pipeline of SVM and word embedding. The proposed system implements a personalized fragrance recommendation mirror based on emotion analysis and provides web services using the Flask web framework. It uses the Google Speech Cloud API to recognize users' voices and speech-to-text (STT) to convert the speech into text data. The proposed system also provides users with information about weather, humidity, location, quotes, time, and schedule management.

A Study on University Big Data-based Student Employment Roadmap Recommendation (대학 빅데이터 기반 학생 취업 로드맵 추천에 관한 연구)

  • Park, Sangsung
    • Journal of Korea Society of Digital Industry and Information Management / v.17 no.3 / pp.1-7 / 2021
  • The number of new students at many domestic universities is declining. Private universities in particular, which depend heavily on tuition, face a crisis of survival. Amid the declining school-age population, universities are striving to recruit new students by improving the quality of education and raising the student employment rate. Recently, universities have increasingly used their accumulated big data to prepare measures for recruiting new students. A representative example is the analysis of factors that affect student employment. Existing studies of employment-influencing factors have applied quantitative models such as regression analysis to university big data. However, the factors affecting employment differ by major, and the analysis needs to reflect this. In this paper, the factors affecting employment are analyzed by major using the data of University C and a decision tree model. Based on the analysis results, a student employment roadmap is recommended for each major. In the experiment, four decision tree models were constructed for each major, and the factors affecting employment and a roadmap were derived for each major.