• Title/Summary/Keyword: Decision Tree Regression

Exploring Industrial Function Combining Factors for Each Type in the 6th Industry Based on Decision Tree Analysis (의사결정나무분석법을 활용한 6차산업 유형별 산업적 기능결합 요인탐색)

  • Kim, Jungtae
    • Journal of Agricultural Extension & Community Development
    • /
    • v.23 no.3
    • /
    • pp.243-255
    • /
    • 2016
  • This study identifies the business characteristics that influence which type a business chooses within the 6th industry and analyzes how they work. Data on 752 businesses certified as belonging to the 6th industry in 2015 were analyzed with the classification and regression tree (CART) algorithm. The results showed that the type of agricultural product processing, region, the type of service, and the production percentage in a province affected the choice of type. The most important variable was the type of agricultural product processing: a business that produced simple agricultural products was likely to specialize in the 1st×2nd or 1st×3rd type. Access to large consumption areas was a critical factor in the growth of 2nd and 3rd industrial functions. These findings contribute to establishing a development model for the 6th industry and empirically demonstrate the importance of access to large consumption areas for agricultural businesses and rural tourism.
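As a rough illustration of how a CART-style classifier ranks candidate variables, the sketch below computes the decrease in Gini impurity for two candidate splits. The records, field names (`processing`, `region`), and type labels are hypothetical toy values, not data from the study, and the study's actual software and settings are not stated in the abstract.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gain(rows, labels, feature, value):
    """Impurity decrease from a binary split on rows[i][feature] == value."""
    left = [y for r, y in zip(rows, labels) if r[feature] == value]
    right = [y for r, y in zip(rows, labels) if r[feature] != value]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Hypothetical toy records (assumption: not taken from the study).
rows = [
    {"processing": "simple", "region": "A"},
    {"processing": "simple", "region": "B"},
    {"processing": "complex", "region": "A"},
    {"processing": "complex", "region": "B"},
]
labels = ["1x2", "1x2", "1x2x3", "1x2x3"]

gain_processing = split_gain(rows, labels, "processing", "simple")
gain_region = split_gain(rows, labels, "region", "A")
print(gain_processing, gain_region)
```

In this toy case the split on `processing` separates the two types completely while the split on `region` does not help at all, so a CART-style procedure would place `processing` at the root.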

A Study on Developing a Predictive Model for Digital Quality Management Based on Decision Tree (의사결정나무 기반 디지털품질경영 예측 모형 개발 연구)

  • Byung-Hoon Park;Ho-Jun Song;Wan-Seon Shin
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.47 no.3
    • /
    • pp.51-67
    • /
    • 2024
  • This study aims to develop a comprehensive predictive model for Digital Quality Management (DQM) and to analyze the impact of various quality activities on different levels of DQM. By employing the Classification And Regression Tree (CART) methodology, we are able to present predictive scenarios that elucidate how varying quantitative levels of quality activities influence the five major categories of DQM. The findings reveal that the operation level of quality circles and the promotion level of suggestion systems are pivotal in enhancing DQM levels. Furthermore, the study emphasizes that an effective reward system is crucial to maximizing the effectiveness of these quality activities. Through a quantitative approach, this study demonstrates that for ventures and small-medium enterprises, expanding suggestion systems and implementing robust reward mechanisms can significantly improve DQM levels, particularly when the operation of quality circles is challenging. The research provides valuable insights, indicating that even in the absence of fully operational quality circles, other mechanisms can still drive substantial improvements in DQM. These results are particularly relevant in the context of digital transformation, offering practical guidelines for enterprises to establish and refine their quality management strategies. By focusing on suggestion systems and rewards, businesses can effectively navigate the complexities of digital transformation and achieve higher levels of quality management.

A study on the optimum cutter spacing ratio according to penetration depth using decision tree-based and SVM regressions (의사결정나무 기반 회귀분석과 SVM 회귀분석을 이용한 커터 관입깊이에 따른 최적 커터간격 비 연구)

  • Lee, Gi-Jun;Ryu, Hee-Hwan;Kwon, Tae-Hyuk
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.22 no.5
    • /
    • pp.501-513
    • /
    • 2020
  • Cutter cutting tests for placing cutters in the cutter head have been conducted in various studies. The cutter spacing at the minimum specific energy is the main input to cutter head design, but the optimum cutter spacing at a given penetration depth varies with rock conditions, so methods for deciding the optimum spacing need further study. Machine learning techniques, namely a decision tree-based regression model and an SVM regression model, were applied to predict the optimum cutter spacing ratio from the nonlinear relationship between cutter penetration depth and cutter spacing. Because decision tree-based methods are strongly influenced by the amount of data, SVM regression predicted the optimum cutter spacing ratio as a function of penetration depth more accurately. SVM regression is therefore expected to be useful for deciding cutter spacing in cutter head design once a large amount of data on the optimum cutter spacing ratio by penetration depth has been accumulated.
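One way to see why tree-based regression depends on the amount of data is to look at how a single regression split is chosen: candidate thresholds exist only between observed values, and each side predicts a constant mean. The following is a minimal sketch under that standard formulation; the `depth` and `ratio` values are hypothetical, and the paper's actual model and data are not reproduced here.

```python
def sse(values):
    """Sum of squared errors of the values around their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, remaining_sse) for the best single split on x.

    Thresholds are midpoints between consecutive distinct x values, so the
    number of candidate splits is bounded by the number of observations.
    """
    pairs = sorted(zip(xs, ys))
    best = (None, sse(ys))  # no split: one constant prediction
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical x values cannot be separated
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical penetration depths and spacing ratios (assumed toy values).
depth = [2.0, 4.0, 6.0, 8.0]
ratio = [10.0, 10.0, 20.0, 20.0]
threshold, remaining = best_split(depth, ratio)
print(threshold, remaining)
```

With only four observations the tree can place its threshold at one of three midpoints and predicts a step function, whereas a kernel method such as SVM regression can produce a smooth curve from the same points.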

Data Mining for Knowledge Management in a Health Insurance Domain

  • Chae, Young-Moon;Ho, Seung-Hee;Cho, Kyoung-Won;Lee, Dong-Ha;Ji, Sun-Ha
    • Journal of Intelligence and Information Systems
    • /
    • v.6 no.1
    • /
    • pp.73-82
    • /
    • 2000
  • This study examined the characteristics of knowledge discovery and data mining algorithms to demonstrate how they can be used to predict health outcomes and provide policy information for hypertension management using the Korea Medical Insurance Corporation database. Specifically, this study validated the predictive power of data mining algorithms by comparing the performance of logistic regression and two decision tree algorithms, CHAID (Chi-squared Automatic Interaction Detection) and C5.0 (a variant of C4.5), since logistic regression has assumed a major position in the healthcare field as a method for predicting or classifying health outcomes based on the specific characteristics of each individual case. The comparison used a test set of 4,588 beneficiaries and the training set of 13,689 beneficiaries that was used to develop the models. Contrary to the previous study, the CHAID algorithm performed better than logistic regression in predicting hypertension, but C5.0 had the lowest predictive power. In addition, the CHAID algorithm and association rules provided segment characteristics for the risk factors that may be used in developing hypertension management programs. This showed that a data mining approach can be a useful analytic tool for predicting and classifying health outcomes data.
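CHAID, as its name indicates, relies on chi-squared tests of association between a predictor and the outcome when choosing splits. The sketch below computes only the Pearson chi-squared statistic for a contingency table; the counts are hypothetical, and the category merging and significance adjustment that a full CHAID implementation performs are omitted.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table of counts.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts (assumption): rows are two categories of a risk factor,
# columns are hypertension yes / no.
associated = [[30, 10], [10, 30]]
independent = [[10, 10], [20, 20]]
print(chi_squared(associated), chi_squared(independent))
```

A predictor whose table gives a larger statistic (for the same degrees of freedom) is more strongly associated with the outcome and is a better candidate for the next split.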

Mapping for Biodiversity Using National Forest Inventory Data and GIS (국가 생태정보를 활용한 생물다양성 지도 구축)

  • Jung, Da-Jung;Kang, Kyung-Ho;Heo, Joon;Kim, Chang-Jae;Kim, Sung-Ho;Lee, Jung-Bin
    • Journal of Environmental Impact Assessment
    • /
    • v.19 no.6
    • /
    • pp.573-581
    • /
    • 2010
  • Natural ecosystems are essential to linking biodiversity conservation plans with climate change response strategies. To make this link, Europe, America, Japan, and China are working to establish the need for protection through national biodiversity valuation, but comparable studies are lacking in Korea. In this study, we built biodiversity maps representing the range of biodiversity distribution using species richness from National Forest Inventory (NFI) and Forest Description data. Using a regression tree algorithm, we divided the data into classes by decision rules and constructed biodiversity maps with an accuracy of over 70%. The biodiversity maps produced in this study can therefore serve as base information for decision makers and for planning biodiversity conservation and continuous management. Furthermore, this study can suggest a strategy for using forest information more efficiently at the national level.

Using Predictive Analytics to Profile Potential Adopters of Autonomous Vehicles

  • Lee, Eun-Ju;Zafarzon, Nordirov;Zhang, Jing
    • Asia Marketing Journal
    • /
    • v.20 no.2
    • /
    • pp.65-83
    • /
    • 2018
  • Technological advances are bringing autonomous vehicles to the ever-evolving transportation system. Anticipating adoption of these technologies by users is essential to vehicle manufacturers for making more precise production and marketing strategies. The research investigates how regulatory focus and consumer innovativeness relate to consumers' adoption of autonomous vehicles (AVs) and to consumers' subsequent willingness to pay for AVs. An online questionnaire was fielded to confirm predictions, and regression analysis was conducted to verify the model's validity. The results show that a promotion focus does not have a significantly positive effect on the automation level at which consumers will adopt AVs, but a prevention focus has a significantly positive effect on conditional AV adoption. Consumer innovativeness, consumers' novelty-seeking have a significantly positive relationship with high and full AV adoption, and consumers' independent decision-making has a significantly positive effect on full AV adoption. The higher the level of automation at which a consumer adopts AVs, the higher the willingness to pay for them. Finally, using neural network and decision tree analyses, we show methods for describing three categories of potential adopters of AVs.

A Study on NOx Emission Control Methods in the Cement Firing Process Using Data Mining Techniques (데이터 마이닝을 이용한 시멘트 소성공정 질소산화물(NOx)배출 관리 방법에 관한 연구)

  • Park, Chul Hong;Kim, Yong Soo
    • Journal of Korean Society for Quality Management
    • /
    • v.46 no.3
    • /
    • pp.739-752
    • /
    • 2018
  • Purpose: The purpose of this study was to investigate the relationship between kiln processing parameters and NOx emissions that occur in the sintering and calcination steps of the cement manufacturing process and to derive the main factors responsible for producing emissions outside emission limit criteria, as determined by category models and classification rules, using data mining techniques. The results from this study are expected to be useful as guidelines for NOx emission control standards. Methods: Data were collected from Precalciner Kiln No.3 used in one of the domestic cement plants in Korea. Thirty-four independent variables affecting NOx generation and dependent variables that exceeded or were below the NOx emission limit (>1 and <0, respectively) were examined during kiln processing. These data were used to construct a detection model of NOx emission, in which emissions exceeded or were below the set limits. The model was validated using SPSS MODELER 18.0, artificial neural network, decision tree (C5.0), and logistic regression analysis data mining techniques. Results: The decision tree (C5.0) algorithm best represented NOx emission behavior and was used to identify 10 processing variables that resulted in NOx emissions outside limit criteria. Conclusion: The results of this study indicate that the decision tree (C5.0) can be applied for real-time monitoring and management of NOx emissions during the cement firing process to satisfy NOx emission control standards and to provide for a more eco-friendly cement product.

Pattern Classification Model Design and Performance Comparison for Data Mining of Time Series Data (시계열 자료의 데이터마이닝을 위한 패턴분류 모델설계 및 성능비교)

  • Lee, Soo-Yong;Lee, Kyoung-Joung
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.21 no.6
    • /
    • pp.730-736
    • /
    • 2011
  • In this paper, we designed pattern classification models that can reflect the latest trend in time series. Fusion models based on statistical and AI methods have been shown to be superior to traditional ones as pattern classification models for decision support. In particular, the hit rates of pattern classification models combined with fuzzy theory are relatively higher. Statistical SVM models combined with a fuzzy membership function, and models combining a neural network with FCM, have shown good performance. BPN, PNN, FNN, FCM, SVM, FSVM, Decision Tree, Time Series Analysis, and Regression Analysis were used as pattern classification models in the experiments. The databases used were an economic index DB with time series properties from the financial market (Korea, KOSPI200 DB) and an electrocardiogram DB of arrhythmia patients in hospital emergencies (USA, MIT-BIH DB).

Assessment of compressive strength of high-performance concrete using soft computing approaches

  • Chukwuemeka Daniel;Jitendra Khatti;Kamaldeep Singh Grover
    • Computers and Concrete
    • /
    • v.33 no.1
    • /
    • pp.55-75
    • /
    • 2024
  • The present study introduces an optimum performance soft computing model for predicting the compressive strength of high-performance concrete (HPC) by comparing, for the first time on a common database, models based on conventional (kernel-based, covariance function-based, and tree-based), advanced machine (least square support vector machine, LSSVM, and minimax probability machine regressor, MPMR), and deep (artificial neural network, ANN) learning approaches. A compressive strength database with results for 1030 concrete samples was compiled from the literature and preprocessed. For training, testing, and validation of the soft computing models, 803, 101, and 101 data points, respectively, were selected arbitrarily from the 1005 preprocessed data points. Thirteen performance metrics, including three new metrics (a20-index, index of agreement, and index of scatter), were implemented for each model. The performance comparison reveals that the SVM (kernel-based), ET (tree-based), MPMR (advanced), and ANN (deep) models achieved higher performance in predicting the compressive strength of HPC. From the overall analysis of performance, accuracy, Taylor plot, accuracy metric, regression error characteristics curve, Anderson-Darling, Wilcoxon, uncertainty, and reliability, model CS4, based on the ensemble tree, was recognized as the optimum performance model, with a correlation coefficient of 0.9352, root mean square error of 5.76 MPa, and mean absolute error of 4.1069 MPa. The present study also reveals that multicollinearity affects the prediction accuracy of Gaussian process regression, decision tree, multilinear regression, and adaptive boosting regressor models, a novel finding in compressive strength prediction of HPC. The cosine sensitivity analysis reveals that the prediction of compressive strength of HPC is highly affected by cement content, fine aggregate, coarse aggregate, and water content.
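Three of the metrics named in the abstract can be sketched in a few lines. The block below assumes the commonly used definition of the a20-index as the fraction of samples whose predicted-to-measured ratio lies between 0.80 and 1.20; the paper's exact definition (including which value is the numerator) is not given in the abstract, and the strength values are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def a20_index(actual, predicted):
    """Share of samples with predicted/actual within [0.8, 1.2].

    Assumed definition; requires non-zero actual values.
    """
    within = sum(1 for a, p in zip(actual, predicted) if 0.8 <= p / a <= 1.2)
    return within / len(actual)


# Hypothetical compressive strengths in MPa (assumed toy values).
measured = [40.0, 50.0, 60.0, 80.0]
estimated = [44.0, 50.0, 45.0, 80.0]

print(rmse(measured, estimated))
print(mae(measured, estimated))
print(a20_index(measured, estimated))
```

RMSE and MAE are in the units of the target (MPa here), while the a20-index is a unitless share, which is why studies of this kind often report them side by side.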

Development of Predictive Model for Length of Stay(LOS) in Acute Stroke Patients using Artificial Intelligence (인공지능을 이용한 급성 뇌졸중 환자의 재원일수 예측모형 개발)

  • Choi, Byung Kwan;Ham, Seung Woo;Kim, Chok Hwan;Seo, Jung Sook;Park, Myung Hwa;Kang, Sung-Hong
    • Journal of Digital Convergence
    • /
    • v.16 no.1
    • /
    • pp.231-242
    • /
    • 2018
  • Efficient management of length of stay (LOS) is important in hospitals: it reduces medical costs for patients and increases profitability for hospitals. To manage LOS efficiently, an artificial intelligence-based prediction model is needed that supports hospitals in benchmarking and reducing LOS. To develop a predictive model of LOS for acute stroke patients, acute stroke patients were extracted from 2013 and 2014 discharge injury patient data. The data were split into 60% for training and 40% for evaluation. For model development, we used a traditional regression technique (multiple regression analysis), artificial intelligence techniques (interactive decision tree and neural network), and an ensemble technique that integrates all of them. Models were evaluated with the Root ASE (Absolute error) index, which was 23.7 for multiple regression, 23.7 for the interactive decision tree, 22.7 for the neural network, and 22.7 for the ensemble technique. In the model evaluation, the neural network, an artificial intelligence technique, was found to be superior. This demonstrates the utility of artificial intelligence in developing LOS prediction models. Further research is needed on how to use artificial intelligence techniques more effectively in developing LOS prediction models.