• Title/Summary/Keyword: tree based learning

Power Consumption Forecasting Scheme for Educational Institutions Based on Analysis of Similar Time Series Data (유사 시계열 데이터 분석에 기반을 둔 교육기관의 전력 사용량 예측 기법)

  • Moon, Jihoon;Park, Jinwoong;Han, Sanghoon;Hwang, Eenjun
    • Journal of KIISE / v.44 no.9 / pp.954-965 / 2017
  • A stable power supply is essential to maintaining and operating power infrastructure, so accurate power consumption prediction is needed. A university campus in particular is among the institutions with the highest power consumption, and its electrical load tends to vary widely with time and environment. For this reason, a model that can accurately predict power consumption is required for the effective operation of the power system. A drawback of existing time series prediction techniques is that prediction performance degrades greatly, because the prediction interval widens as the gap between the learning time and the prediction time grows. In this paper, we first classify power data with similar time series patterns, considering the date, day of the week, holidays, and semester. Next, an ARIMA model is constructed for each classified data set, and a daily power consumption forecasting method for the university campus is proposed through time series cross-validation of the predicted time. To evaluate prediction accuracy, we confirmed the validity of the proposed method by applying performance indicators.
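
The two steps described in this abstract, grouping days with similar patterns and then evaluating a forecaster by time series cross-validation, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the day-type labels, the load values, and the `mean_of_last_three` forecaster are all hypothetical, and the simple average stands in for the ARIMA model the paper fits to each group.

```python
from collections import defaultdict
from statistics import mean


def group_by_day_type(records):
    """Group (day_type, load) records so each group holds similar days."""
    groups = defaultdict(list)
    for day_type, load in records:
        groups[day_type].append(load)
    return dict(groups)


def rolling_origin_mae(series, min_train, forecast):
    """Time series cross-validation: fit on series[:t], predict series[t]."""
    errors = []
    for t in range(min_train, len(series)):
        prediction = forecast(series[:t])
        errors.append(abs(prediction - series[t]))
    return mean(errors)


def mean_of_last_three(history):
    # Hypothetical stand-in forecaster; the paper uses an ARIMA model here.
    return mean(history[-3:])


# Toy daily loads in chronological order (values invented for illustration).
records = [
    ("weekday", 100.0), ("weekend", 40.0), ("weekday", 110.0),
    ("weekday", 105.0), ("weekend", 42.0), ("weekday", 120.0),
    ("weekday", 115.0),
]
groups = group_by_day_type(records)
weekday_mae = rolling_origin_mae(groups["weekday"], 3, mean_of_last_three)
print(f"weekday MAE: {weekday_mae:.2f}")
```

Each prediction uses only observations that precede the predicted day, which is what distinguishes this evaluation from ordinary shuffled cross-validation.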

Bounds of PIM-based similarity measures with partially marginal proportion (부분적 주변 비율에 의한 확률적 흥미도 측도 기반 유사성 측도의 상한 및 하한의 설정)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.26 no.4 / pp.857-864 / 2015
  • According to Wikipedia, data mining is the computational process of discovering patterns in huge data sets, involving methods at the intersection of association rules, decision trees, clustering, artificial intelligence, and machine learning. Clustering, or cluster analysis, is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups. The similarity measures used in clustering may be classified into various types depending on the characteristics of the data. In this paper, we computed bounds for similarity measures based on the probabilistic interestingness measure with partially marginal probability, such as the Peirce I, Peirce II, Cole I, Cole II, Loevinger, Park I, and Park II measures. We confirmed that the absolute value of the Loevinger measure was the upper limit of the absolute value of every other existing measure. The ordering of the other measures is determined by the size of the concurrence proportion, non-simultaneous occurrence proportion, and mismatch proportion.

Morpheme Recovery Based on Naïve Bayes Model (NB 모델을 이용한 형태소 복원)

  • Kim, Jae-Hoon;Jeon, Kil-Ho
    • The KIPS Transactions:PartB / v.19B no.3 / pp.195-200 / 2012
  • In Korean morphological analysis, spelling changes in various forms must be recovered into base forms, and because Korean is agglutinative, part-of-speech (POS) tagging is difficult without morphological analysis. This is one of the notorious problems in Korean morphological analysis and has been handled by morpheme recovery rules, which generate morphological ambiguity that is then resolved by POS tagging. In this paper, we propose a morpheme recovery scheme based on machine learning methods such as Naïve Bayes models. The input features of the models are the surrounding context of the syllable in which the spelling change occurs, and the categories of the models are the recovered syllables. The POS tagging system with the proposed model achieved an $F_1$-score of 97.5% on the ETRI tree-tagged corpus. The proposed model is therefore judged to be very useful for handling morpheme recovery in Korean.

Virtual Block Game Interface based on the Hand Gesture Recognition (손 제스처 인식에 기반한 Virtual Block 게임 인터페이스)

  • Yoon, Min-Ho;Kim, Yoon-Jae;Kim, Tae-Young
    • Journal of Korea Game Society / v.17 no.6 / pp.113-120 / 2017
  • With the development of virtual reality technology, user-friendly hand gesture interfaces for natural interaction with virtual 3D objects have been studied increasingly in recent years. Most earlier studies on hand gesture interfaces use relatively simple hand gestures. In this paper, we propose an intuitive hand gesture interface for interacting with 3D objects in virtual reality applications. For hand gesture recognition, we first preprocess various hand data and classify the data with a binary decision tree. The classified data is resampled and converted to a chain code, and hand feature data is then constructed from histograms of the chain code. Finally, the input gesture is recognized from the feature data by MCSVM-based machine learning. To test the proposed hand gesture interface, we implemented a 'Virtual Block' game. Our experiments showed a recognition rate of about 99.2% for 16 kinds of command gestures, and the interface was more intuitive and user-friendly than a conventional mouse interface.
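
The chain-code and histogram step in this abstract might look roughly like the sketch below. It assumes an 8-direction (Freeman-style) code over 2D points, which the abstract does not specify; the function names and the square trajectory are hypothetical, and the decision-tree preprocessing and MCSVM classifier are not shown.

```python
import math


def chain_code(points, directions=8):
    """Quantize each step between consecutive points into a direction code."""
    sector = 2 * math.pi / directions
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (x0, y0) == (x1, y1):
            continue  # a repeated sample has no direction
        angle = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        codes.append(round(angle / sector) % directions)
    return codes


def chain_code_histogram(codes, directions=8):
    """Normalized frequency of each direction code (the feature vector)."""
    counts = [0] * directions
    for code in codes:
        counts[code] += 1
    total = len(codes)
    return [count / total for count in counts] if total else counts


# Hypothetical resampled trajectory: a unit square traced counter-clockwise.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
codes = chain_code(square)
features = chain_code_histogram(codes)
print(codes, features)
```

A histogram discards the order of the strokes, so it yields a fixed-length vector regardless of how many points the gesture contains, which is convenient as classifier input.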

An Analysis of Vegetation Status in an Urban Natural Park -Focus on Seoo Royal Tomb-

  • Kang, Hyun-Kyoung;Bang, Kwang-Ja;Kim, Hyun
    • Journal of the Korean Institute of Landscape Architecture International Edition / no.1 / pp.13-25 / 2001
  • Demand for urban open space has recently been increasing because of urban development and environmental deterioration. Urban natural parks in Seoul provide citizens with comfortable open space and thus play an important role as learning spaces for experiencing nature and understanding the environment. Accordingly, this study aims to analyze existing vegetation and provide basic data for the conservation and management plans of urban natural parks and for education programs. The study covers the natural environment, such as topography, altitude, slope, and aspect, and the botanical ecosystem, including the structure of plant communities and tree growth. According to the topography analysis, the overall altitude was not high but the slope was relatively steep. The vegetation of Seoo Royal Tomb, an urban natural park, has been classified into 12 types: Quercus acutissima community (lowland type), Quercus acutissima community (valley type), Quercus variabilis community, Quercus mongolica community, Castanea crenata community, Carpinus laxiflora community, Pinus densiflora community (lowland type), Pinus densiflora community (slope type), Robinia pseudo-acacia community, Populus × albaglandulosa community, Pinus rigida community, and Pinus koraiensis community. Based on the survey and analysis results, we classified the study area into conservation, buffer, and utilization zones for effective management. By analyzing vegetation conditions at Seoo Royal Tomb, this study provides basic data to support the establishment of master plans for urban natural parks. Based on the results presented in the study, consistent monitoring needs to be conducted, and detailed management plans should also be prepared.

Machine-assisted Semi-Simulation Model (MSSM): Predicting Galactic Baryonic Properties from Their Dark Matter Using A Machine Trained on Hydrodynamic Simulations

  • Jo, Yongseok;Kim, Ji-hoon
    • The Bulletin of The Korean Astronomical Society / v.44 no.2 / pp.55.3-55.3 / 2019
  • We present a pipeline to estimate baryonic properties of a galaxy inside a dark matter (DM) halo in DM-only simulations using a machine trained on high-resolution hydrodynamic simulations. As an example, we use the IllustrisTNG hydrodynamic simulation of a $(75\,h^{-1}\,\mathrm{Mpc})^3$ volume to train our machine to predict, e.g., stellar mass and star formation rate in a galaxy-sized halo based purely on its DM content. An extremely randomized tree (ERT) algorithm is used together with multiple novel improvements we introduce here, such as a refined error function in machine training and two-stage learning. Aided by these improvements, our model demonstrates a significantly increased accuracy in predicting baryonic properties compared to prior attempts; in other words, the machine better mimics IllustrisTNG's galaxy-halo correlation. By applying our machine to the MultiDark-Planck DM-only simulation of a large $(1\,h^{-1}\,\mathrm{Gpc})^3$ volume, we then validate the pipeline that rapidly generates a galaxy catalogue from a DM halo catalogue using the correlations the machine found in IllustrisTNG. We also compare our galaxy catalogue with the ones produced by popular semi-analytic models (SAMs). Our so-called machine-assisted semi-simulation model (MSSM) is shown to be largely compatible with SAMs, and may become a promising method to transplant the baryon physics of galaxy-scale hydrodynamic calculations onto a larger-volume DM-only run. We discuss the benefits that machine-based approaches like this entail, as well as suggestions to raise the scientific potential of such approaches.

Real-time Fall Accident Prediction using Random Forest in IoT Environment (사물인터넷 환경에서 랜덤포레스트를 이용한 실시간 낙상 사고 예측)

  • Chan-Woo Bang;Bong-Hyun Kim
    • Journal of Internet of Things and Convergence / v.10 no.4 / pp.27-33 / 2024
  • As of 2023, the number of accident victims in the domestic construction industry was 26,829, second only to the other businesses category (service industries). The accident types of casualties in all industries were falls (29,229 people), followed by falls (14,357 people). Based on these data, this study attaches sensors to hard hats and insoles to predict the fall accidents that frequently occur at construction sites, and proposes smart safety equipment that applies a random forest algorithm to the data collected through them. The random forest model can determine fall accidents in real time with high accuracy by generating multiple decision trees and combining the predictions of each tree. The model classifies whether a worker has had a fall accident, and the type of behavior, from data collected by the MPU-6050 sensor attached to the hard hat. Fall accidents first detected from the hard hat are then predicted a second time through sensors attached to the insole, thereby increasing prediction accuracy. This is expected to enable rapid response in the event of an accident, thereby reducing worker deaths and accidents.

Experimental Comparison of Network Intrusion Detection Models Solving Imbalanced Data Problem (데이터의 불균형성을 제거한 네트워크 침입 탐지 모델 비교 분석)

  • Lee, Jong-Hwa;Bang, Jiwon;Kim, Jong-Wouk;Choi, Mi-Jung
    • KNOM Review / v.23 no.2 / pp.18-28 / 2020
  • With the development of the virtual community, the benefits that IT provides to people in fields such as healthcare, industry, communication, and culture are increasing, and the quality of life is also improving. At the same time, various malicious attacks target the developed network environment. Firewalls and intrusion detection systems exist to detect these attacks in advance, but they are limited in detecting malicious attacks that evolve day by day. To solve this problem, intrusion detection research using machine learning is being actively conducted, but false positives and false negatives occur because of imbalance in the learning dataset. In this paper, a Random Oversampling method is used to address the imbalance of the UNSW-NB15 dataset used for network intrusion detection. Through experiments, we compared and analyzed the accuracy, precision, recall, F1-score, training and prediction time, and hardware resource consumption of the models. Building on this study of the Random Oversampling method, we aim to develop more efficient network intrusion detection models using other methods and high-performance models that can address the imbalanced data problem.
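
Random oversampling, as named in this abstract, can be sketched as follows: minority-class records are duplicated at random until every class is as large as the largest one. This is a generic illustration under stated assumptions, not the authors' code; the function name, the fixed seed, and the toy feature vectors and labels are hypothetical and are not taken from UNSW-NB15.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate random minority samples until all classes are equally large."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw with replacement from the class's own samples to fill the gap.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data: four "normal" records and a single "attack" record.
samples = [[0.1], [0.2], [0.3], [0.4], [9.0]]
labels = ["normal", "normal", "normal", "normal", "attack"]
balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))
```

Because the method only copies existing records, it adds no new information. Oversampling is normally applied to the training split alone, so that duplicated records do not leak into the evaluation data.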

Data analysis by Integrating statistics and visualization: Visual verification for the prediction model (통계와 시각화를 결합한 데이터 분석: 예측모형 대한 시각화 검증)

  • Mun, Seong Min;Lee, Kyung Won
    • Design Convergence Study / v.15 no.6 / pp.195-214 / 2016
  • Predictive analysis is based on probabilistic learning algorithms known as pattern recognition or machine learning. Users who want to extract more information from the data therefore need substantial statistical knowledge. In addition, it is difficult to identify patterns and characteristics in the data. This study conducted statistical and visual data analyses to compensate for these weaknesses of predictive analysis. Through this study, we found some implications that had not been found in previous studies. First, we could identify data patterns when adjusting data selection according to the splitting criteria of the decision tree method. Second, we could identify what types of data were included in the final prediction model. In the statistical analysis, we found relationships among multiple variables and derived a prediction model for high box office performance. In the visualization analysis, we proposed a visual analysis method with various interactive functions. Finally, through this study we verified the final prediction model and suggested an analysis method for extracting a variety of information from the data.

Prediction Model for unfavorable Outcome in Spontaneous Intracerebral Hemorrhage Based on Machine Learning

  • Shengli Li;Jianan Zhang;Xiaoqun Hou;Yongyi Wang;Tong Li;Zhiming Xu;Feng Chen;Yong Zhou;Weimin Wang;Mingxing Liu
    • Journal of Korean Neurosurgical Society / v.67 no.1 / pp.94-102 / 2024
  • Objective : Spontaneous intracerebral hemorrhage (ICH) remains a significant cause of mortality and morbidity throughout the world. The purpose of this retrospective study is to develop multiple models for predicting ICH outcomes using machine learning (ML). Methods : Between January 2014 and October 2021, we included ICH patients identified by computed tomography or magnetic resonance imaging and treated with surgery. At the 6-month check-up, outcomes were assessed using the modified Rankin Scale. In this study, four ML models, namely Support Vector Machine (SVM), Decision Tree C5.0, Artificial Neural Network, and Logistic Regression, were used to build ICH prediction models. To evaluate the reliability of the ML models, we calculated the area under the receiver operating characteristic curve (AUC), specificity, sensitivity, accuracy, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR). Results : We identified 71 patients who had favorable outcomes and 156 who had unfavorable outcomes. The results showed that the SVM model achieved the best comprehensive prediction efficiency. For the SVM model, the AUC, accuracy, specificity, sensitivity, PLR, NLR, and DOR were 0.91, 0.92, 0.92, 0.93, 11.63, 0.076, and 153.03, respectively. For the SVM model, we found that the importance value of time to operating room (TOR) was significantly higher than that of the other variables. Conclusion : The analysis of clinical reliability showed that the SVM model achieved the best comprehensive prediction efficiency and that the importance value of TOR was significantly higher than that of the other variables.
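
The likelihood ratios and diagnostic odds ratio reported in this abstract follow from sensitivity and specificity through the standard definitions PLR = sensitivity / (1 - specificity), NLR = (1 - sensitivity) / specificity, and DOR = PLR / NLR. The sketch below applies those definitions to the rounded SVM figures quoted above. The function name is hypothetical, and this is not the authors' evaluation code.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (PLR, NLR, DOR) from sensitivity and specificity."""
    plr = sensitivity / (1 - specificity)   # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                         # diagnostic odds ratio
    return plr, nlr, dor


# Rounded values reported for the SVM model in the abstract.
plr, nlr, dor = likelihood_ratios(sensitivity=0.93, specificity=0.92)
print(f"PLR={plr:.2f}  NLR={nlr:.3f}  DOR={dor:.2f}")
```

With these rounded inputs the PLR and NLR match the reported 11.63 and 0.076. The DOR comes out slightly below the reported 153.03, which would be consistent with the authors having computed it from unrounded values.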