• Title/Summary/Keyword: tree based learning

Evaluation Method of College English Education Effect Based on Improved Decision Tree Algorithm

  • Dou, Fang
    • Journal of Information Processing Systems / v.18 no.4 / pp.500-509 / 2022
  • With the rapid development of educational informatization, teaching methods have become diversified, but the large volume of information data hinders the evaluation of teaching subjects and objects with respect to the effect of English education. Therefore, this study adopts the concept of incremental learning and an eigenvalue interval algorithm to improve the weighted decision tree, and builds an English education effect evaluation model based on association rules. According to the results, the average information classification accuracy of the improved decision tree algorithm is 96.18%, the classification error rate can be as low as 0.02%, and the algorithm resists overfitting well. The classification error rate between the improved decision tree algorithm and the original decision tree does not exceed 1%. The proposed educational evaluation method can effectively provide early warning in academic situation analysis, accelerate the improvement of teachers' professional skills, and help perfect the education system.

Machine Learning Approach to Blood Stasis Pattern Identification Based on Self-reported Symptoms (기계학습을 적용한 자기보고 증상 기반의 어혈 변증 모델 구축)

  • Kim, Hyunho;Yang, Seung-Bum;Kang, Yeonseok;Park, Young-Bae;Kim, Jae-Hyo
    • Korean Journal of Acupuncture / v.33 no.3 / pp.102-113 / 2016
  • Objectives: This study aims to develop and discuss a prediction model for the blood stasis pattern of traditional Korean medicine (TKM) using two machine learning algorithms: multiple logistic regression and a decision tree model. Methods: First, we reviewed the Korean, Chinese, and Japanese versions of the blood stasis (BS) questionnaire to build an integrated BS questionnaire of patient-reported outcomes. Patient-reported BS symptom data were acquired through human subject research. Next, expert decisions from 5 Korean medicine doctors were acquired, and supervised learning models were developed using multiple logistic regression and a decision tree. Results: An integrated BS questionnaire with 24 items was developed. Multiple logistic regression models with accuracies of 0.92 (male) and 0.95 (female), validated by 10-fold cross-validation, were constructed. Decision tree modeling produced a male model with 8 decision nodes and a female model with 6 decision nodes. In both models, the symptoms 'recent physical trauma', 'chest pain', 'numbness', and 'menstrual disorder (female only)' were identified as important factors. Conclusions: Because machine learning, especially supervised learning, can reveal and suggest the important or essential factors among the wide variety of symptoms that make up a pattern identification, it can be a very useful tool for research on TKM diagnostics. With proper patient-reported outcomes or a well-structured database, it can also be applied to pre-screening solutions for the healthcare system in the Mibyoung stage.
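The 10-fold cross-validation mentioned in this abstract can be sketched as follows. This is a minimal illustration and not the authors' code: the helper names, the majority-class stand-in model, and the unshuffled contiguous folds are all assumptions made for the example, and it assumes at least `k` samples.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def majority_class_fit(train_features, train_labels):
    """Hypothetical stand-in model: always predicts the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: most_common


def k_fold_accuracy(features, labels, k=10, fit=majority_class_fit):
    """Mean held-out accuracy over k folds (each fold is the test set once)."""
    scores = []
    for held_out in k_fold_indices(len(labels), k):
        held = set(held_out)
        train_x = [features[i] for i in range(len(labels)) if i not in held]
        train_y = [labels[i] for i in range(len(labels)) if i not in held]
        model = fit(train_x, train_y)
        correct = sum(model(features[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)
```

Any function that takes training features and labels and returns a predictor can be passed as `fit`, so a logistic regression or decision tree could replace the stand-in model. In practice the data would usually be shuffled before splitting.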

Improving the Performance of Korean Text Chunking by Machine learning Approaches based on Feature Set Selection (자질집합선택 기반의 기계학습을 통한 한국어 기본구 인식의 성능향상)

  • Hwang, Young-Sook;Chung, Hoo-jung;Park, So-Young;Kwak, Young-Jae;Rim, Hae-Chang
    • Journal of KIISE:Software and Applications / v.29 no.9 / pp.654-668 / 2002
  • In this paper, we present an empirical study on improving Korean text chunking through machine learning and feature set selection. We focus on two issues: selecting a feature set for Korean chunking, and alleviating data sparseness. To select a proper feature set, we use a heuristic method that searches the space of feature sets, using the estimated performance of a machine learning algorithm as a measure of the "incremental usefulness" of a particular feature set. To smooth data sparseness, we suggest using a general part-of-speech tag set together with selective lexical information that takes the characteristics of the Korean language into account. Experimental results showed that chunk tags and lexical information within a given context window are important features, and that spacing unit information is less important than the others; these findings are independent of the machine learning technique. Furthermore, using the selective lexical information not only gives a smoothing effect but also reduces the feature space compared with using all of the lexical information. Korean text chunking based on memory-based learning and decision tree learning with the selected feature space achieved precision/recall of 90.99%/92.52% and 93.39%/93.41%, respectively.
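The heuristic search over feature sets described in this abstract resembles a greedy wrapper search. The sketch below shows one common variant, forward selection, and is not necessarily the procedure the authors used. The feature names and the numbers in `toy_score` are invented for illustration and are not results from the paper.

```python
def forward_select(candidates, score):
    """Greedy wrapper search: repeatedly add the feature that most improves score.

    `score` maps a frozenset of feature names to an estimated performance
    (higher is better). The search stops when no single addition helps.
    """
    selected = frozenset()
    best = score(selected)
    while True:
        trials = [(score(selected | {f}), f)
                  for f in sorted(candidates) if f not in selected]
        if not trials:
            break  # every candidate is already selected
        top_score, top_feature = max(trials)
        if top_score <= best:
            break  # no remaining feature is incrementally useful
        selected, best = selected | {top_feature}, top_score
    return selected, best


def toy_score(features):
    """Hypothetical performance estimate; a real one would train and evaluate a model."""
    usefulness = {"chunk_tag": 0.5, "lexical": 0.3, "pos_tag": 0.1}
    # Unlisted features (here "spacing") slightly hurt the made-up score.
    return sum(usefulness.get(f, -0.05) for f in features)


chosen, estimate = forward_select(
    ["chunk_tag", "lexical", "pos_tag", "spacing"], toy_score)
```

In a real setting, `score` would wrap a cross-validated run of the learner on the given feature set, which is what makes each step of the search expensive.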

Using an Adaptive Search Tree to Predict User Location

  • Oh, Se-Chang
    • Journal of Information Processing Systems / v.8 no.3 / pp.437-444 / 2012
  • In this paper, we propose a method for predicting a user's location based on their past movement patterns. The method places no restriction on the length of the past movement patterns used to predict the current location. For this purpose, a modified search tree has been devised. The search tree is constructed efficiently as it learns a user's movement patterns one by one; the time complexity of learning a movement pattern is linear. During learning, the search tree expands to take more details of the movement patterns into account whenever a pattern that conflicts with an existing trained pattern is found. In this manner, the search tree is trained to make exact matches, as needed, for location prediction. In the experiments, this method was highly accurate in comparison with more complex and sophisticated methods. The deviation in accuracy across users was also significantly lower for this method than for the other methods, which means that it is highly stable under variations in behavioral patterns. Finally, 1.47 locations were considered on average when making a prediction with this method, which shows that the prediction process is very efficient.
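To make the idea of predicting from past movement patterns with a tree concrete, here is a sketch of a generic suffix-trie predictor. It is not the adaptive search tree from the paper: it always stores contexts up to a fixed `max_depth` (assumed to be at least 1) instead of expanding only when a conflicting pattern appears. The class names and location labels are hypothetical.

```python
from collections import Counter


class Node:
    def __init__(self):
        self.children = {}            # previous location -> deeper context node
        self.next_counts = Counter()  # how often each location followed this context


class SuffixTriePredictor:
    def __init__(self, max_depth=3):
        self.root = Node()
        self.max_depth = max_depth

    def learn(self, path):
        """Record which location followed each recent context along `path`."""
        for i in range(1, len(path)):
            node = self.root
            node.next_counts[path[i]] += 1
            # Walk backwards through the history, most recent location first.
            for prev in reversed(path[max(0, i - self.max_depth):i]):
                node = node.children.setdefault(prev, Node())
                node.next_counts[path[i]] += 1

    def predict(self, history):
        """Return the most frequent successor of the longest matching context."""
        node = self.root
        for prev in reversed(history[-self.max_depth:]):
            if prev not in node.children:
                break
            node = node.children[prev]
        if not node.next_counts:
            return None  # nothing has been learned yet
        return node.next_counts.most_common(1)[0][0]


predictor = SuffixTriePredictor()
predictor.learn(["home", "cafe", "office"])
predictor.learn(["gym", "cafe", "home"])
predictor.learn(["home", "cafe", "office"])
```

With this toy data, the context "cafe" alone is ambiguous, and the extra step of history ("home" versus "gym") resolves it. That is the same motivation as the conflict-driven expansion described in the abstract, although the mechanism here is simpler.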

Development of Medical Cost Prediction Model Based on the Machine Learning Algorithm (머신러닝 알고리즘 기반의 의료비 예측 모델 개발)

  • Han Bi KIM;Dong Hoon HAN
    • Journal of Korea Artificial Intelligence Association / v.1 no.1 / pp.11-16 / 2023
  • Accurate hospital case modeling and prediction are crucial for efficient healthcare. In this study, we demonstrate the implementation of regression analysis methods in machine learning systems using mathematical statistics and machine learning techniques. The developed machine learning models include Bayesian linear, artificial neural network, decision tree, decision forest, and linear regression analysis models. Through the application of these algorithms, corresponding regression models were constructed and analyzed. The results suggest the potential of leveraging machine learning systems for medical research. The experiment aimed to create an Azure Machine Learning Studio tool for the speedy evaluation of multiple regression models. The tool facilitates the comparison of 5 types of regression models in a unified experiment and presents assessment results with performance metrics. Evaluation of the regression machine learning models highlighted the advantages of boosted decision tree regression and decision forest regression in hospital case prediction. These findings could lay the groundwork for the deliberate development of new directions in medical data processing and decision making. Furthermore, potential avenues for future research may include exploring methods such as clustering, classification, and anomaly detection in healthcare systems.

The Evaluation of a Plastic Material Classification System using Near Field IR (NIR) Spectrum and Decision Tree based Machine Learning (Near Field IR (NIR) 스펙트럼 및 결정 트리 기반 기계학습을 이용한 플라스틱 재질 분류 시스템)

  • Kook, Joongjin
    • Journal of the Semiconductor & Display Technology / v.21 no.3 / pp.92-97 / 2022
  • Plastics are classified into 7 types for separation and recycling: PET (PETE), HDPE, PVC, LDPE, PP, PS, and Other. Recently, large corporations advocating ESG management have been replacing them with bioplastics. Disposal of plastic waste by incineration and landfill is responsible for air pollution and destruction of the ecosystem. Because it is not easy to accurately classify plastic materials with the naked eye, automated screening systems using various sensor technologies and AI-based software technologies have been studied. This paper presents an NIR scanning device that takes into account the NIR wavelength characteristics, which differ for each plastic material, and a system that can identify the type of plastic by learning the NIR spectrum data collected with the device. The accuracy of plastic material identification was evaluated using a decision tree-based SVM model for multiclass classification on NIR spectral datasets for 8 types of plastic samples, including biodegradable plastic.

Improvement of BigCloneBench Using Tree-Based Convolutional Neural Network (트리 기반 컨볼루션 신경망을 이용한 BigCloneBench 개선)

  • Park, Gunwoo;Hong, Sung-Moon;Kim, Hyunha;Doh, Kyung-Goo
    • Journal of Software Assessment and Valuation / v.15 no.1 / pp.43-53 / 2019
  • BigCloneBench has recently been used to evaluate the performance of code clone detection tools that use machine learning. However, since BigCloneBench is not a benchmark optimized for machine learning, incorrect learning data can be created. In this paper, we show through machine learning experiments that Type-4 clone methods can be found in addition to the set provided by BigCloneBench. Experimental results using a Tree-Based Convolutional Neural Network show that our proposed method is effective in improving BigCloneBench's dataset.

Prediction of short-term algal bloom using the M5P model-tree and extreme learning machine

  • Yi, Hye-Suk;Lee, Bomi;Park, Sangyoung;Kwak, Keun-Chang;An, Kwang-Guk
    • Environmental Engineering Research / v.24 no.3 / pp.404-411 / 2019
  • In this study, we designed a data-driven model to predict chlorophyll-a using the M5P model tree and the extreme learning machine (ELM). The Juksan weir in the Youngsan River shows high levels of chlorophyll-a, the primary indicator of algal bloom, every year. Short-term algal bloom prediction is important for environmental management and ecological assessment. Two models were developed and evaluated for short-term algal bloom prediction. M5P is a classification and regression-analysis-based method, and ELM is a feed-forward neural network with fast learning that uses the least square estimate for regression. The dataset used in this study includes water temperature, rainfall, solar radiation, total nitrogen, total phosphorus, N/P ratio, and chlorophyll-a, collected on a daily basis from January 2013 to December 2016. With the M5P model, the one-day-ahead prediction performed best, and performance dropped off rapidly for predictions three or more days ahead. Comparing the two models, the ELM model performed better for 1-7 d chlorophyll-a prediction. Moreover, in periods of rapidly increasing algal blooms, the ELM model showed higher accuracy than the M5P model.
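A model tree such as M5P places linear regression models at the leaves of a tree. The sketch below shows only that core idea on a single input variable with a single split. It is not M5P itself, which handles many variables and adds pruning and smoothing, and it is not the authors' model. The function names and the toy numbers are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b, sum of squared errors)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    b = mean_y - a * mean_x
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_model_stump(xs, ys, min_leaf=2):
    """Choose the split on x that minimises the total SSE of one line per side."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between equal x values
        la, lb, lsse = fit_line(*zip(*pairs[:i]))
        ra, rb, rsse = fit_line(*zip(*pairs[i:]))
        if best is None or lsse + rsse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (lsse + rsse, threshold, (la, lb), (ra, rb))
    if best is None:
        raise ValueError("not enough distinct points to split")
    return best[1:]


def predict_stump(model, x):
    """Apply the linear model of whichever leaf x falls into."""
    threshold, left, right = model
    a, b = left if x <= threshold else right
    return a * x + b


# Made-up data: y rises with x up to 3, then falls.
toy_x = [0, 1, 2, 3, 4, 5, 6, 7]
toy_y = [0, 1, 2, 3, 10, 8, 6, 4]
stump = fit_model_stump(toy_x, toy_y)
```

On this toy data the split lands between 3 and 4, and each side gets its own slope and intercept, which a single regression line could not capture.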

Application of machine learning methods for predicting the mechanical properties of rubbercrete

  • Miladirad, Kaveh;Golafshani, Emadaldin Mohammadi;Safehian, Majid;Sarkar, Alireza
    • Advances in concrete construction / v.14 no.1 / pp.15-34 / 2022
  • The use of waste rubber in concrete can reduce natural aggregate consumption and improve some technical properties of concrete. Although there are several equations for estimating the mechanical properties of concrete containing waste rubber, only a limited number of machine learning-based models have been proposed to predict the mechanical properties of rubbercrete. In this study, an extensive database of the mechanical properties of rubbercrete was gathered from a comprehensive survey of the literature. To model the mechanical properties of rubbercrete, two machine learning techniques, the M5P tree and linear gene expression programming (LGEP), were employed to obtain reliable mathematical equations. Two procedures of input variable selection were considered. In the first procedure, the crucial component ratios of rubbercrete and the concrete age were used as the input variables. In the second procedure, the volumes of the coarse and fine waste rubber and the compressive strength of concrete without waste rubber were used as the input variables. The results show that the models obtained by LGEP are more accurate than those achieved by the M5P model tree and existing traditional equations. In addition, the input variables of the second procedure are better predictors of the mechanical properties of rubbercrete than those of the first procedure.

A Machine Learning Approach for Mechanical Motor Fault Diagnosis (기계적 모터 고장진단을 위한 머신러닝 기법)

  • Jung, Hoon;Kim, Ju-Won
    • Journal of Korean Society of Industrial and Systems Engineering / v.40 no.1 / pp.57-64 / 2017
  • Damage to major railroad components can interrupt railroad services, cause safety accidents, and generate unnecessary maintenance costs. To reduce such damage, the development of rolling stock maintenance technology is shifting, led by advanced countries, from preventive maintenance based on the inspection period to predictive maintenance. Furthermore, to enhance trust as systems speed up and simultaneously reduce maintenance costs, the demand for fault diagnosis and prognostic health management technology is increasing. The objective of this paper is to propose a highly reliable learning model, using various machine learning algorithms, that can be applied to critical rolling stock components. This paper presents a model for railway rolling stock component fault diagnosis and conducts a mechanical failure diagnosis of motor components by applying machine learning techniques, in order to ensure efficient maintenance support, along with a data preprocessing plan for component fault diagnosis. The paper first defines a failure diagnosis model for rolling stock components. The function-based algorithms ANFIS and SMO were used as machine learning techniques for generating the failure diagnosis model. Two tree-based algorithms, RandomForest and CART, were also employed. To evaluate the performance of the algorithms for diagnosing failures in motors, a critical railroad component, an experiment was carried out on 2 data sets with different classes (6 classes and 3 class levels). According to the results of the experiment, the random forest algorithm, a tree-based machine learning technique, showed the best performance.