• Title/Summary/Keyword: tree based learning

URL Filtering by Using Machine Learning

  • Saqib, Malik Najmus
    • International Journal of Computer Science & Network Security / v.22 no.8 / pp.275-279 / 2022
  • The growth of technology has made many things easier for people, from small everyday tasks to more complex ones. This growth has also brought illegal activities that are performed using technology, ranging from displaying annoying messages to large-scale fraud. The easiest way for an attacker to carry out such activities is to convince a user to click on a malicious link. Classifying URLs as malicious or benign has been a major concern for a decade. Blacklists were used initially for this purpose and are still in use today. They are efficient, but they are difficult to update automatically. This method is therefore being replaced by classification of URLs with machine learning algorithms. In this paper, we use four machine learning classification algorithms to classify URLs as malicious or benign: support vector machine, random forest, n-nearest neighbor, and decision tree. The dataset used in this research has 36,694 instances. Precision, accuracy, and recall values are compared for the dataset with and without preprocessing.
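
The three metrics compared in the abstract above can be computed from a confusion matrix. The following is a minimal sketch, not the paper's code: the label strings, the helper names (`confusion`, `precision`, `recall`, `accuracy`), and the six-URL toy data are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # malicious URLs correctly flagged as malicious
    fp: int  # benign URLs wrongly flagged as malicious
    tn: int  # benign URLs correctly passed as benign
    fn: int  # malicious URLs wrongly passed as benign


def confusion(y_true, y_pred, positive="malicious"):
    """Count the four outcomes, treating `positive` as the class of interest."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return Confusion(tp, fp, tn, fn)


def precision(c):
    # Share of flagged URLs that really were malicious.
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def recall(c):
    # Share of malicious URLs that were flagged.
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def accuracy(c):
    # Share of all URLs that were classified correctly.
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


# Hypothetical labels for six URLs (illustration only, not the paper's data).
y_true = ["malicious", "benign", "malicious", "benign", "benign", "malicious"]
y_pred = ["malicious", "benign", "benign", "benign", "malicious", "malicious"]

c = confusion(y_true, y_pred)
print(precision(c), recall(c), accuracy(c))
```

Running the same functions on predictions made with and without preprocessing would give the kind of side-by-side comparison the abstract describes.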

A Study on Deep Learning model for classifying programs by functionalities (기능성에 따른 프로그래밍 소스코드 분류를 위한 Deep Learning Model 연구)

  • Yoon, Joo-Sung;Lee, Eun-Hun;An, Jin-Hyeon;Kim, Hyun-Cheol
    • Annual Conference of KIPS / 2016.10a / pp.615-616 / 2016
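  • As the paradigm has recently shifted toward the fourth industrial revolution, the software industry has become more important. Accordingly, demand for coding education has grown worldwide, and code management has become more important for companies that want to build good software. Because it is practically impossible for people to grade and manage large volumes of programming source code one by one, a code evaluation system that can address this problem is needed. However, there are no clear criteria for what makes code good or how code should be evaluated, and research on the question is scarce. Deep learning, which has recently attracted attention, is achieving far better results than earlier machine learning algorithms in areas such as image processing and natural language processing. In the programming language domain, however, it has not yet been studied in depth. This study therefore aims to address these problems by using a Tree-based Convolutional Neural Network, a variant of the Convolutional Neural Network known as a deep learning technique, to study algorithms for analyzing and classifying programming source code and representation learning for code.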

A study on Natural Disaster Prediction Using Multi-Class Decision Forest

  • Eom, Tae-Hyuk;Kim, Kyung-A
    • Korean Journal of Artificial Intelligence / v.10 no.1 / pp.1-7 / 2022
  • This paper presents a study on predicting natural disasters in Afghanistan using machine learning. Preparation for natural disasters is needed not only in Korea but also in other vulnerable countries. Every year in Afghanistan, natural disasters (snow, earthquake, drought, flood) cause property damage and casualties. We chose to study this phenomenon because we believed the damage could be reduced with preparation. The Azure Machine Learning Studio used in the study has the advantage of being more visual and easier to use than other machine learning tools. Decision forest is a classification model based on decision trees. It enables intuitive analysis because its results are easy to interpret and it presents the key variables and split criteria. Also, because it is a nonparametric model, it is free of the assumptions (normality, independence, equal variance) required by statistical models. Finally, it can explore linear and non-linear relationships while accounting for interactions between variables. For these reasons, the study used decision forest. The study found an overall accuracy of 89 percent and an average accuracy of 97 percent. Although the experiment showed fairly high accuracy, disaster types with low frequency were predicted less accurately because of insufficient training data. Training on more data can improve overall accuracy, and predicting natural disasters can reduce damage.
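
The abstract above does not define "overall accuracy" and "average accuracy". One common reading for multiclass evaluation, assumed here, is that overall accuracy is the share of samples classified correctly, while average accuracy is the mean of the per-class one-vs-rest accuracies. The sketch below uses that assumed reading with made-up labels (the four disaster types come from the abstract; the eight samples do not) to show why the average figure can sit above the overall one: each per-class score also counts true negatives.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def average_accuracy(y_true, y_pred):
    """Mean of per-class one-vs-rest accuracies (assumed definition)."""
    classes = sorted(set(y_true) | set(y_pred))
    per_class = []
    for cls in classes:
        # A sample counts as correct for `cls` when truth and prediction
        # agree on whether it belongs to `cls` (true positives and negatives).
        agree = sum((t == cls) == (p == cls) for t, p in zip(y_true, y_pred))
        per_class.append(agree / len(y_true))
    return sum(per_class) / len(per_class)


# Hypothetical toy labels, for illustration only.
y_true = ["flood", "flood", "drought", "earthquake",
          "snow", "flood", "drought", "snow"]
y_pred = ["flood", "flood", "drought", "flood",
          "snow", "flood", "snow", "snow"]

overall = overall_accuracy(y_true, y_pred)
average = average_accuracy(y_true, y_pred)
print(overall, average)
```

In this toy case the rare "earthquake" class is never predicted, which lowers overall accuracy more than it lowers the average, in line with the abstract's remark about low-frequency disaster types.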

A Qualitative Study on Adult Learners' Learning Experience Typology in Humanities & General Education (성인학습자의 인문교양교육 학습경험 유형화에 관한 질적 연구)

  • Kim, Mi-Jeong;Lee, Jung-Hee;Ahn, Young-Sik
    • Journal of Fisheries and Marine Sciences Education / v.25 no.2 / pp.510-525 / 2013
  • The purpose of this study is to investigate adult learners' experiences of studying Humanities & General Education and to identify types and characteristics by classifying those learning experiences. The study uses the grounded theory method, which is suited to investigating subjective experiences. Data were collected through Focus Group Interviews (FGI) with 13 adult learners who participated in Humanities & General Education at D university in the Busan region. The data were categorized by open coding, axial coding, and selective coding, following the data analysis method and procedures of grounded theory. The outcomes are as follows. Through abbreviation and categorization of the learning experiences, 113 concepts, 38 subcategories, and 16 upper categories were derived. For the process of learning experience, the study shows the interrelationships within a paradigm frame and derives, through abbreviation and categorization, the causal condition, contextual condition, phenomenon, and interaction (helping/obstructing factors). Three types of learning experience and their characteristics were identified: 1) "Self-realization" learners participate in Humanities & General Education out of a desire to learn; they want to find their identity and plan their future in detail. 2) "The pursuit of happiness" learners have less desire to learn than the "self-realization" type and participate because of someone else's help and suggestion. 3) "Local community" learners participate because they feel the need for a social role and expect local development, based on their interest in the local community. Several conclusions and suggestions for further study are provided.

Intelligent On-demand Routing Protocol for Ad Hoc Network

  • Ye, Yongfei;Sun, Xinghua;Liu, Minghe;Mi, Jing;Yan, Ting;Ding, Lihua
    • Journal of Information Processing Systems / v.16 no.5 / pp.1113-1128 / 2020
  • Ad hoc networks play an important role in mobile communications, and the performance of nodes has a significant impact on the choice of communication links. To ensure efficient and secure data forwarding and delivery, an intelligent routing protocol (IAODV) based on a learning method is constructed. Five attributes (node energy, rate, credit value, computing power, and transmission distance) are taken as the basis for segmentation. By learning from the selected samples and calculating the information gain of each attribute, a decision tree for routing nodes is constructed and the rules for routing node selection are determined. The IAODV algorithm adaptively evaluates and classifies network nodes in order to determine the optimal transmission path from the source node to the destination node. The simulation results verify the feasibility, effectiveness, and security of IAODV.
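
The information-gain step mentioned in the abstract above can be illustrated with a short sketch. This is a generic ID3-style calculation, not the IAODV implementation: the discretized attribute values ("high"/"low"), the `suitable` label, and the four sample nodes are hypothetical, and only two of the five attributes are shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, label_key="suitable"):
    """Entropy reduction obtained by splitting `rows` on `attribute`."""
    base = entropy([r[label_key] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[label_key])
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical sample nodes with discretized attributes (illustration only).
samples = [
    {"energy": "high", "credit": "high", "suitable": "yes"},
    {"energy": "high", "credit": "low", "suitable": "yes"},
    {"energy": "low", "credit": "high", "suitable": "no"},
    {"energy": "low", "credit": "low", "suitable": "no"},
]

attributes = ["energy", "credit"]
gains = {a: information_gain(samples, a) for a in attributes}
# The attribute with the largest gain would become the root split.
best = max(attributes, key=lambda a: gains[a])
print(gains, best)
```

Repeating the same selection on each resulting subset yields the tree, whose root-to-leaf paths can then be read as node selection rules.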

Estimation of lightweight aggregate concrete characteristics using a novel stacking ensemble approach

  • Kaloop, Mosbeh R.;Bardhan, Abidhan;Hu, Jong Wan;Abd-Elrahman, Mohamed
    • Advances in nano research / v.13 no.5 / pp.499-512 / 2022
  • This study investigates the efficiency of ensemble machine learning for predicting the characteristics of lightweight-aggregate concrete (LWC). A stacking ensemble (STEN) approach was proposed to estimate the dry density (DD) and 28-day compressive strength (Fc-28) of LWC. Using two meta-models, random forest regressor (RFR) and extra tree regressor (ETR), two novel ensemble models called STEN-RFR and STEN-ETR were constructed. Four standalone machine learning models (artificial neural network, gradient boosting regression, K neighbor regression, and support vector regression) were used to compare the performance of the proposed models. For this purpose, a total of 140 LWC mixtures with 21 influencing parameters for producing LWC with a density less than 1000 kg/m³ were used. Based on the experimental results with multiple performance criteria, it can be concluded that the proposed STEN-ETR model can be used to estimate the DD and Fc-28 of LWC. Moreover, the STEN-ETR approach was found to be an effective technique for predicting DD and Fc-28 of LWC with minimal prediction error. In the validation phase, the accuracy of the proposed STEN-ETR model in predicting DD and Fc-28 was found to be 96.79% and 81.50%, respectively. In addition, the cement, water-cement ratio, silica fume, and aggregate with expanded glass variables were found to be significant in modeling DD and Fc-28 of LWC.

Performance Improvement of Genetic Programming Based on Reinforcement Learning (강화학습에 의한 유전자 프로그래밍의 성능 개선)

  • 전효병;이동욱;심귀보
    • Journal of the Korean Institute of Intelligent Systems / v.8 no.3 / pp.1-8 / 1998
  • This paper proposes reinforcement genetic programming, which applies the reinforcement learning method to improve the performance of genetic programming. Genetic programming, which represents programs as tree structures, is highly flexible in problem representation because, unlike other evolutionary algorithms, it places no limit on chromosome size. This same characteristic, however, often leads to worse convergence associated with the mutation and crossover operations. The population size and maximum number of generations are therefore typically larger than those of other evolutionary algorithms. This paper proposes a new method that executes crossover and mutation operations based on the reinforcement and inhibition mechanism of reinforcement learning. The validity of the proposed method is evaluated by applying it to the artificial ant problem.

Performance Evaluation of Stacking Models Based on Random Forest, XGBoost, and LGBM for Wind Power Forecasting (Random Forest, XGBoost, LGBM 조합형 Stacking 모델을 이용한 풍력 발전량 예측 성능 평가)

  • Hui-Chan Kim;Dae-Young Kim;Bum-Suk Kim
    • Journal of Wind Energy / v.15 no.3 / pp.21-29 / 2024
  • Wind power is highly variable due to the intermittent nature of wind. This can lead to power grid instability and decreased efficiency. Therefore, it is necessary to improve wind power prediction performance to minimize the negative impact on the power system. Recently, wind power prediction using machine learning has gained popularity, and ensemble models in machine learning have shown high prediction accuracy. RF, GB, XGB and LGBM are decision tree-based ensemble models with high predictive performance for wind power, but they suffer from over-fitting and strong dependence on certain variables. A stacking model, however, can improve prediction performance by combining individual models and compensating for the shortcomings of each. In this study, the MAE of RF, XGB and LGBM is 310.42 kWh, 217.07 kWh and 265.20 kWh, respectively, while that of the stacking model based on RF, XGB and LGBM is 202.33 kWh, showing that stacking can improve prediction performance. This work is expected to contribute to electricity supply and demand planning.
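
To put the reported errors in perspective, the sketch below defines the mean absolute error (MAE) and computes how much lower the stacking model's MAE is relative to each base model. Only the four MAE values come from the abstract above; the function names and the small sanity-check inputs are assumptions for illustration.

```python
def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline, improved):
    """Fractional decrease of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


# MAE values in kWh as reported in the abstract.
reported_mae_kwh = {"RF": 310.42, "XGB": 217.07, "LGBM": 265.20}
stacking_mae_kwh = 202.33

reductions = {
    name: relative_reduction(value, stacking_mae_kwh)
    for name, value in reported_mae_kwh.items()
}

for name, r in reductions.items():
    print(f"Stacking MAE is {r:.1%} lower than {name}")

# Sanity check of the MAE definition on made-up numbers.
print(mae([100.0, 200.0], [110.0, 180.0]))
```

By this arithmetic the stacking model's MAE is roughly 35% lower than RF, 24% lower than LGBM, and 7% lower than XGB, so most of the gain is over the weaker base models.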

A Machine Learning Approach to Web Image Classification (기계학습 기반의 웹 이미지 분류)

  • Cho, Soo-Sun;Lee, Dong-Woo;Han, Dong-Won;Hwang, Chi-Jung
    • The KIPS Transactions:PartB / v.9B no.6 / pp.759-764 / 2002
  • Although images account for a large part of Web documents, there has not been much research on analyzing and understanding them. Many Web images carry important information, but others do not. In this paper, we classify Web images from currently served Web sites into erasable and non-erasable classes based on machine learning methods. For this research, we identified 16 special and rich features for Web images and experimented with the Bayesian and decision tree methods. As a result, F-measures of 87.09% and 82.72% were achieved for the respective methods. In particular, experiments comparing the effects of feature groups showed that the features added in this study are very useful for Web image classification.

Comparison of the Machine Learning Models Predicting Lithium-ion Battery Capacity for Remaining Useful Life Estimation (리튬이온 배터리 수명추정을 위한 용량예측 머신러닝 모델의 성능 비교)

  • Yoo, Sangwoo;Shin, Yongbeom;Shin, Dongil
    • Journal of the Korean Institute of Gas / v.24 no.6 / pp.91-97 / 2020
  • Lithium-ion batteries (LIBs) have a longer lifespan, higher energy density, and lower self-discharge rates than other batteries, and are therefore preferred for Energy Storage Systems (ESS). However, 28 ESS fire accidents occurred in Korea during 2017-2019, and accurate capacity estimation of LIBs is essential to ensure safety and reliability during operation. In this study, data-driven modeling was conducted to predict capacity changes over the charging cycles of an LIB, and the performance of the developed models was compared to select the optimal machine learning model. The models include Decision Tree, Ensemble Learning Method, Support Vector Regression, and Gaussian Process Regression (GPR). Lithium battery test data provided by NASA was used for model training, and GPR showed the best prediction performance. Based on this study, we will develop an enhanced LIB capacity prediction and remaining useful life estimation model through additional data training, and improve the performance of anomaly detection and monitoring during operation, enabling safe and stable ESS operation.