• Title/Summary/Keyword: Decision forest

Comparative Analysis of Effective Algorithm Techniques for the Detection of Syn Flooding Attacks (Syn Flooding 탐지를 위한 효과적인 알고리즘 기법 비교 분석)

  • Jong-Min Kim;Hong-Ki Kim;Joon-Hyung Lee
    • Convergence Security Journal / v.23 no.5 / pp.73-79 / 2023
  • Cyber threats are evolving and becoming more sophisticated as new technologies develop, and consequently the number of service failures caused by DDoS attacks is continually increasing. Recently, DDoS attacks have caused numerous types of service failures by directing a large amount of traffic at the domain address of a specific service or server. In this paper, data were generated for the SYN flooding attack, a representative type of bandwidth exhaustion attack, and then compared and analyzed using the Random Forest, Decision Tree, Multi-Layer Perceptron, and KNN algorithms to detect attacks effectively and derive the optimal algorithm. These results are expected to be useful as a basis for detection policies against SYN flooding attacks.

Machine learning-based Predictive Model of Suicidal Thoughts among Korean Adolescents (머신러닝 기반 한국 청소년의 자살 생각 예측 모델)

  • YeaJu JIN;HyunKi KIM
    • Journal of Korea Artificial Intelligence Association / v.1 no.1 / pp.1-6 / 2023
  • This study developed models using decision forest, support vector machine, and logistic regression methods to predict and prevent suicidal ideation among Korean adolescents. The study sample consisted of 51,407 individuals after removing missing data from the raw data of the 18th (2022) Youth Health Behavior Survey conducted by the Korea Centers for Disease Control and Prevention. Analysis was performed in MS Azure using Two-Class Decision Forest, Two-Class Support Vector Machine, and Two-Class Logistic Regression. The decision forest model achieved an accuracy of 84.8% and an F1-score of 36.7%, the support vector machine model an accuracy of 86.3% and an F1-score of 24.5%, and the logistic regression model an accuracy of 87.2% and an F1-score of 40.1%. Applying the logistic regression model with SMOTE to address data imbalance resulted in an accuracy of 81.7% and an F1-score of 57.7%. Although the accuracy decreased slightly, the recall, precision, and F1-score improved, demonstrating excellent performance. These findings have significant implications for the development of prediction models for suicidal ideation among Korean adolescents and can contribute to the prevention of youth suicide.
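
The gap between accuracy and F1-score reported in this abstract is typical of imbalanced data. As a minimal sketch, the following computes both metrics from confusion-matrix counts. The counts are hypothetical and are not taken from the study; the function name is likewise only illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: 1,000 samples, of which only 130 are positive.
metrics = classification_metrics(tp=30, fp=20, fn=100, tn=850)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the accuracy is high because the many true negatives dominate it, while the F1-score stays low because most positives are missed. Rebalancing the training data (for example with SMOTE, as in the abstract) trades some accuracy for better recall.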

Using Mechanical Learning Analysis of Determinants of Housing Sales and Establishment of Forecasting Model (기계학습을 활용한 주택매도 결정요인 분석 및 예측모델 구축)

  • Kim, Eun-mi;Kim, Sang-Bong;Cho, Eun-seo
    • Journal of Cadastre & Land InformatiX / v.50 no.1 / pp.181-200 / 2020
  • This study used an OLS model to estimate the determinants affecting the length of home ownership and then compared the predictive power of SVM, Decision Tree, Random Forest, Gradient Boosting, XGBoost, and LightGBM models. It differs from preceding studies in that it builds on a Stacking model, one of the ensemble models, to establish a model with higher predictive power for identifying the volume of housing transactions in the housing market. The OLS analysis showed that sales profits, housing prices, the number of household members, and the type of residential housing (detached housing, apartments) affected the period of housing ownership. When predictive power was compared using RMSE, the machine learning models showed higher predictive power. The data were then rebuilt with the influencing variables and each machine learning model was applied again; Random Forest showed the best predictive power. In addition, the Random Forest, Decision Tree, Gradient Boosting, and XGBoost models, which had the highest predictive power, were applied as individual models, and the Stacking model was constructed using Linear, Ridge, and Lasso models as meta models. As a result, the RMSE with the Ridge meta model was the lowest at 0.5181, making it the model with the highest predictive power.
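
The models in this abstract are ranked by RMSE (root mean squared error), where a lower value means better predictions. As a minimal sketch, the following computes RMSE and picks the better of two candidates. The target values and the two sets of predictions are made up for illustration and have no connection to the study's data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed values and predictions from two hypothetical models.
actual = [3.0, 5.0, 2.0, 7.0]
predictions = {
    "model_a": [2.0, 5.0, 3.0, 7.0],
    "model_b": [3.0, 4.0, 2.0, 5.0],
}

scores = {name: rmse(actual, pred) for name, pred in predictions.items()}
best = min(scores, key=scores.get)  # the model with the lowest RMSE
print(scores, best)
```

Because errors are squared before averaging, RMSE penalizes a single large miss more heavily than several small ones, which is why `model_b` scores worse here despite being exact on two of the four samples.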

A Study on Developing an Optimization Model for Particleboard Manufacturing Processes (파티클보드 제조공정(製造工程)의 최적화(最適化) 모델개발에 관한 연구(硏究))

  • Chung, Joo Sang;Park, Hee Jun;Lee, Phil Woo
    • Journal of Korean Society of Forest Science / v.82 no.4 / pp.396-405 / 1993
  • In this paper, a nonlinear programming model to determine the optimal operating policy to minimize production costs for particleboard plants is presented. The model provides optimal values for three decision variables: specific gravity of particleboard, mat moisture content, and mat resin content. These decision variables are key factors influencing the cost and quality of particleboard manufacturing processes. In formulating the nonlinear programming model, the minimum quality standards for internal bond strength and modulus of rupture of particleboard are used as industry-wide quality constraints. These quality standards are expressed as nonlinear functions of the decision variables. To demonstrate the applicability of the proposed model, it is applied to find optimal solutions to four theoretical problems. The problem scenarios are built to investigate the effects of changes in hot-pressing speed and in the purchase prices of chips and resin.

A research on the key factors for classification of diabetes based on random forest

  • Shin, Yong sub;Lee, Namju;Hwang, Chigon
    • International Journal of Internet, Broadcasting and Communication / v.12 no.3 / pp.102-107 / 2020
  • Recently, the number of people visiting hospitals because of diabetes has been increasing. According to the Korean Diabetes Association, 1 in 7 adults over the age of 30 has diabetes, making it one of the most common diseases today. In this paper, in addition to blood sugar, which is widely used to recognize diabetes, we studied BMI, which is known to be related to diabetes, and triglycerides and cholesterol, which cause various complications in diabetics, using random forest and decision tree techniques known to be effective for classification. The importance of each factor was confirmed using the results and the feature importances derived from the two techniques. Through this, we studied the relationship between diabetes and BMI, triglycerides, and cholesterol, as well as blood sugar, a factor to which diabetic patients should pay close attention.

Tree size determination for classification ensemble

  • Choi, Sung Hoon;Kim, Hyunjoong
    • Journal of the Korean Data and Information Science Society / v.27 no.1 / pp.255-264 / 2016
  • Classification is a form of predictive modeling for a categorical target variable. Classification ensemble methods, which predict with better accuracy by combining multiple classifiers, have become a powerful machine learning and data mining paradigm. Well-known classification ensemble methodologies are boosting, bagging, and random forest. In this article, we assume that decision trees are used as the classifiers in the ensemble. Further, we hypothesize that tree size affects classification accuracy. To study how tree size influences accuracy, we performed experiments using twenty-eight data sets. We then compared the performance of the ensemble algorithms (bagging, double-bagging, boosting, and random forest) with different tree sizes.
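
Bagging and random forest combine the predictions of their individual trees by majority vote. As a minimal sketch of that combining step only, the following takes hard-coded, hypothetical per-tree labels in place of trained decision trees. The function names are illustrative and do not come from the article.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label among the votes for one sample."""
    return Counter(votes).most_common(1)[0][0]


def ensemble_predict(per_tree_predictions):
    """Combine predictions where per_tree_predictions[i][j] is tree i's label for sample j."""
    # zip(*...) regroups the labels so that each tuple holds all votes for one sample.
    return [majority_vote(sample_votes)
            for sample_votes in zip(*per_tree_predictions)]


# Hypothetical outputs of three trees on four samples.
tree_outputs = [
    ["a", "b", "a", "b"],
    ["a", "a", "b", "b"],
    ["b", "b", "a", "b"],
]
combined = ensemble_predict(tree_outputs)
print(combined)
```

Each tree here is wrong on at most one sample relative to the combined result, yet the vote recovers a consistent labeling. This averaging-out of individual errors is why the size and diversity of the component trees matter for ensemble accuracy.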

Default Prediction of Automobile Credit Based on Support Vector Machine

  • Chen, Ying;Zhang, Ruirui
    • Journal of Information Processing Systems / v.17 no.1 / pp.75-88 / 2021
  • The automobile credit business has developed rapidly in recent years, and defaults occur frequently. Credit default brings great losses to automobile financial institutions, so the successful prediction of automobile credit default is of great significance. First, the missing values are deleted, then random forest is used for feature selection, and then the sample data are randomly grouped. Finally, six prediction models are constructed: support vector machine (SVM), random forest, k-nearest neighbor (KNN), logistic regression, decision tree, and artificial neural network (ANN). The results show that these six machine learning models can be used to predict the default of automobile credit. Among the six models, the decision tree has the highest accuracy at 0.79, but the SVM has the best overall performance. Random grouping can also improve the efficiency of model operation to a certain extent, especially for SVM.

A Real-Time Sound Recognition System with a Decision Logic of Random Forest for Robots (Random Forest를 결정로직으로 활용한 로봇의 실시간 음향인식 시스템 개발)

  • Song, Ju-man;Kim, Changmin;Kim, Minook;Park, Yongjin;Lee, Seoyoung;Son, Jungkwan
    • The Journal of Korea Robotics Society / v.17 no.3 / pp.273-281 / 2022
  • In this paper, we propose a robot sound recognition system that detects various sound events in real time using a microphone on a robot. To achieve real-time performance, we use a VGG11 model, which includes several convolutional layers, with a real-time normalization scheme. The VGG11 model is trained on a database augmented across 24 environments (12 reverberation times and 2 signal-to-noise ratios). Additionally, a decision logic based on the random forest algorithm is designed to generate event signals for robot applications. For specific classes of acoustic events, this logic performs better than using the outputs of the network model alone. Experimental results demonstrate the performance of the proposed sound recognition system on a real-time device for robots.

Axial load prediction in double-skinned profiled steel composite walls using machine learning

  • G., Muthumari G;P. Vincent
    • Computers and Concrete / v.33 no.6 / pp.739-754 / 2024
  • This study presents an innovative AI-driven approach to assess the ultimate axial load in Double-Skinned Profiled Steel sheet Composite Walls (DPSCWs). Utilizing a dataset of 80 entries, seven input parameters were employed, and various AI techniques, including Linear Regression, Polynomial Regression, Support Vector Regression, Decision Tree Regression, Decision Tree with AdaBoost Regression, Random Forest Regression, Gradient Boost Regression Tree, Elastic Net Regression, Ridge Regression, and LASSO Regression, were evaluated. Decision Tree Regression and Random Forest Regression emerged as the most accurate models. The top three performing models were integrated into a hybrid approach, excelling in accurately estimating DPSCWs' ultimate axial load. This adaptable hybrid model outperforms traditional methods, reducing errors in complex scenarios. The validated Artificial Neural Network (ANN) model showcases less than 1% error, enhancing reliability. Correlation analysis highlights robust predictions, emphasizing the importance of steel sheet thickness. The study contributes insights for predicting DPSCW strength in civil engineering, suggesting optimization and database expansion. The research advances precise load capacity estimation, empowering engineers to enhance construction safety and explore further machine learning applications in structural engineering.

Development on Prediction Algorithm of Sediment Discharge by Debris Flow for Decision of Location and Scale of the Check Dam (사방댐 위치 및 규모 결정을 위한 토석류 토사유출량 예측 알고리즘 개발)

  • Kim, Kidae;Woo, Choongshik;Lee, Changwoo;Seo, Junpyo;Kang, Minjeng
    • Journal of the Society of Disaster Information / v.16 no.3 / pp.586-593 / 2020
  • Purpose: This study aims to develop an algorithm for predicting sediment discharge by debris flow and to develop a GIS-based decision support system for the optimal arrangement of check dams. Method: The average stream width and flow length were used to predict the cumulative sediment discharge by debris flow. The amount of slope failure in the source area and the average flow length were used as input factors. Result: The sediment discharge predicted by the algorithm differed from the actual sediment discharge by debris flow by a factor of 1.1 on average. In addition, the program provides an objective indicator for selecting the location and size of the check dam, and it can help practitioners make rational decisions. Conclusion: Soil erosion control works are implemented every year. Therefore, the GIS-based decision support system for the location and size of the check dam is expected to contribute to the prevention of sediment-related disasters.