• Title/Summary/Keyword: machine learning models

A Method for Correcting Air-Pressure Data Collected by Mini-AWS (소형 자동기상관측장비(Mini-AWS) 기압자료 보정 기법)

  • Ha, Ji-Hun;Kim, Yong-Hyuk;Im, Hyo-Hyuc;Choi, Deokwhan;Lee, Yong Hee
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.3 / pp.182-189 / 2016
  • Accurate forecasts from numerical weather prediction models require large volumes of dense weather observation data. The Korea Meteorological Administration (KMA) maintains Automatic Weather Stations (AWSs) to collect such data, but their installation and maintenance costs are high. Mini-AWS is a very compact automatic weather station that can measure and record temperature, humidity, and pressure. In contrast to an AWS, a Mini-AWS is cheap to install and maintain, and it has few space constraints, so it is easier to install wherever weather observations are wanted. However, data observed by Mini-AWSs cannot be used directly, because they can be affected by the surroundings. In this paper, we propose a correction method that makes pressure data observed by Mini-AWSs usable as weather observation data. We first preprocessed the Mini-AWS pressure data and then corrected them with machine learning methods so that they match the pressure data of the nearest AWS. Our experimental results showed that the corrected pressure data are within regulation and that the correction method using support vector regression (SVR) performed very well.
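
The correction step described above (fitting Mini-AWS readings to those of the nearest AWS) can be illustrated with a minimal sketch. It swaps in an ordinary least-squares line for the SVR used in the paper, purely to keep the example dependency-free; the function names and the pressure values are hypothetical and are not taken from the paper.

```python
from statistics import mean

def fit_linear_correction(mini_aws, reference):
    """Least-squares fit of reference ~ slope * mini_aws + intercept.

    Stand-in for the paper's SVR model (assumption made for illustration).
    """
    if len(mini_aws) != len(reference) or len(mini_aws) < 2:
        raise ValueError("need two or more paired readings")
    mx, my = mean(mini_aws), mean(reference)
    sxx = sum((x - mx) ** 2 for x in mini_aws)
    if sxx == 0:
        raise ValueError("Mini-AWS readings are constant; slope is undefined")
    sxy = sum((x - mx) * (y - my) for x, y in zip(mini_aws, reference))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def correct(readings, slope, intercept):
    """Apply the fitted correction to raw Mini-AWS readings."""
    return [slope * x + intercept for x in readings]

# Made-up paired pressure readings in hPa (illustrative only).
mini_aws_hpa = [1009.8, 1011.1, 1012.9, 1014.2, 1015.8]
aws_hpa = [1011.0, 1012.2, 1014.1, 1015.3, 1017.0]

slope, intercept = fit_linear_correction(mini_aws_hpa, aws_hpa)
corrected = correct(mini_aws_hpa, slope, intercept)
```

In practice the fit would be made on a training period and applied to later readings; a nonlinear regressor such as SVR can also use additional inputs (for example temperature or humidity) that a single-variable line cannot.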

A Method of Detecting the Aggressive Driving of Elderly Driver (노인 운전자의 공격적인 운전 상태 검출 기법)

  • Koh, Dong-Woo;Kang, Hang-Bong
    • KIPS Transactions on Software and Data Engineering / v.6 no.11 / pp.537-542 / 2017
  • Aggressive driving is a major cause of car accidents. Previous studies have mainly analyzed the aggressive driving tendencies of young drivers, and they relied only on pure clustering or classification techniques from machine learning. However, because elderly people have different driving habits owing to their frail physical condition, a new method, such as one that enhances the characteristics of driving data, is needed to properly analyze aggressive driving by elderly drivers. In this study, acceleration data collected from a smartphone in a moving vehicle are analyzed with a newly proposed ECA (Enhanced Clustering method for Acceleration data) technique, coupled with conventional clustering techniques (K-means clustering and the expectation-maximization (EM) algorithm). ECA selects high-intensity data from the cluster groups detected by K-means and EM across all subjects' data and models the characteristic data through scaled values. With this method, unlike with pure clustering, aggressive driving data were collected for all young and elderly participants. We further found that K-means clustering has higher detection efficiency than the EM method. The K-means results also show that young drivers have a driving strength 1.29 times higher than that of elderly drivers. In conclusion, the proposed method can detect aggressive driving maneuvers in data from elderly drivers, whose operating intensity is low, and can be used to construct a customized safe driving system for them. In the future, it will be possible to detect abnormal driving conditions and to use the collected data to give drivers early warnings.
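
The idea of clustering acceleration data and keeping the high-intensity group can be sketched as follows. This is not the ECA technique itself, whose scaling and modeling steps are not specified in the abstract; it is only plain one-dimensional K-means followed by selection of the cluster with the largest center. The function names and sample values are hypothetical.

```python
def kmeans_1d(values, k=2, iterations=50):
    """Plain Lloyd's K-means on scalar values; returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the initial centers over the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

def high_intensity_samples(values, k=2):
    """Return the samples assigned to the cluster with the largest center."""
    centers, labels = kmeans_1d(values, k)
    top = max(range(k), key=lambda c: centers[c])
    return [v for v, lab in zip(values, labels) if lab == top]

# Made-up acceleration magnitudes in m/s^2 (illustrative only).
accel = [0.4, 0.5, 0.6, 0.5, 2.8, 0.4, 3.1, 0.6, 2.9, 0.5]
harsh = high_intensity_samples(accel)
print(harsh)
```

With these sample values the three large readings form the high-intensity cluster. Real smartphone data would be multi-axis and noisy, so filtering and a per-driver scaling step would likely be needed before clustering.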

Hourly Prediction of Particulate Matter (PM2.5) Concentration Using Time Series Data and Random Forest (시계열 데이터와 랜덤 포레스트를 활용한 시간당 초미세먼지 농도 예측)

  • Lee, Deukwoo;Lee, Soowon
    • KIPS Transactions on Software and Data Engineering / v.9 no.4 / pp.129-136 / 2020
  • PM2.5, a very fine airborne particulate matter even smaller than PM10, has become an environmental problem. Because PM2.5 can cause eye diseases or respiratory problems and can even infiltrate deep blood vessels in the brain, predicting it is important. However, PM2.5 is difficult to predict because its formation and movement are not yet clearly explained. Prediction methods are therefore needed that are both accurate and interpretable. To predict hourly PM2.5 in Seoul, we propose a method that uses a random forest with an adjusted bootstrap number, trained on time series ground data preprocessed from different sources. With this method, the prediction model can be trained uniformly on hourly information and its results are interpretable. To evaluate prediction performance, we conducted comparative experiments, in which the proposed method outperformed the other models on all labels. The proposed method also revealed the importance of the variables related to PM2.5 formation and the effect of China.

A point-scale gap filling of the flux-tower data using the artificial neural network (인공신경망 기법을 이용한 청미천 유역 Flux tower 결측치 보정)

  • Jeon, Hyunho;Baik, Jongjin;Lee, Seulchan;Choi, Minha
    • Journal of Korea Water Resources Association / v.53 no.11 / pp.929-938 / 2020
  • In this study, we estimated missing evapotranspiration (ET) data at an eddy-covariance flux tower in the Cheongmicheon farmland site using an Artificial Neural Network (ANN). ANNs have shown excellent performance in numerical analysis and are being applied in a growing range of fields. To evaluate the performance of the ANN-based gap filling, ET was also calculated with the existing gap-filling methods of Mean Diurnal Variation (MDV) and Food and Agriculture Organization Penman-Monteith (FAO-PM). The ET estimates were then evaluated by time series comparison and statistical analysis (coefficient of determination, index of agreement (IOA), root mean squared error (RMSE), and mean absolute error (MAE)). To validate each gap-filling model, we used 30-minute data from 2015. Of the 121 missing values, the MDV, FAO-PM, and ANN methods filled 70, 53, and 84, respectively, so the ANN method performed best. The coefficient of determination (0.673, 0.784, and 0.841 for the MDV, FAO-PM, and ANN methods, respectively) and the IOA (0.899, 0.890, and 0.951, respectively) indicated that all three methods yielded highly correlated estimates and are usable, and that the ANN model showed the highest performance and suitability. The results of this study can support further work on gap filling of flux tower data with machine learning methods.
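
The four evaluation statistics named above can be computed as in the following sketch. It assumes the coefficient of determination is defined as 1 - SSE/SST (the paper may instead use the squared correlation coefficient) and uses Willmott's form of the index of agreement; the function name and the sample values are hypothetical.

```python
from math import sqrt

def gap_fill_scores(observed, predicted):
    """Return R^2, IOA, RMSE, and MAE for paired observed/predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need non-empty sequences of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in errors)                      # sum of squared errors
    sst = sum((o - mean_obs) ** 2 for o in observed)      # total sum of squares
    if sst == 0:
        raise ValueError("observed values are constant; R^2 and IOA are undefined")
    # Willmott's "potential error" term used as the IOA denominator.
    potential = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    return {
        "r2": 1 - sse / sst,
        "ioa": 1 - sse / potential,
        "rmse": sqrt(sse / n),
        "mae": sum(abs(e) for e in errors) / n,
    }

# Made-up ET values (illustrative only, not from the paper).
observed_et = [2.0, 3.0, 4.0, 5.0]
predicted_et = [2.5, 2.5, 4.5, 4.5]
scores = gap_fill_scores(observed_et, predicted_et)
```

IOA is bounded between 0 and 1, whereas R² defined this way can go negative for a poor model, which is one reason the two are often reported together.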

Design of Data-centroid Radial Basis Function Neural Network with Extended Polynomial Type and Its Optimization (데이터 중심 다항식 확장형 RBF 신경회로망의 설계 및 최적화)

  • Oh, Sung-Kwun;Kim, Young-Hoon;Park, Ho-Sung;Kim, Jeong-Tae
    • The Transactions of The Korean Institute of Electrical Engineers / v.60 no.3 / pp.639-647 / 2011
  • In this paper, we introduce a design methodology for data-centroid Radial Basis Function (RBF) neural networks with extended polynomial functions. The two underlying design mechanisms of such networks are the K-means clustering method and Particle Swarm Optimization (PSO): K-means clustering provides efficient processing of the data, and PSO optimizes the model. As the connection weights of the RBF neural networks, we can use four types of polynomials: simplified, linear, quadratic, and modified quadratic. K-means clustering selects the center values of the Gaussian functions used as activation functions. The PSO-based RBF neural network is structurally optimized and offers more flexibility than conventional RBF neural networks. The PSO-based design procedure, applied at each node of the RBF neural network, selects preferred parameters with specific local characteristics (such as the number of input variables, a specific subset of input variables, and the distribution constant of the activation function). To evaluate the performance of the proposed data-centroid RBF neural network with extended polynomial functions, we experimented with nonlinear process data (two-dimensional synthetic data and Mackey-Glass time series data) and machine learning datasets (NOx emission process data from a gas turbine plant, Automobile Miles per Gallon (MPG) data, and Boston housing data). To analyze the characteristics of the nonlinear datasets and to construct and evaluate the dynamic network model efficiently, each dataset is partitioned in two ways: Division I (training and testing datasets) and Division II (training, validation, and testing datasets). A comparative analysis shows that the proposed RBF neural networks produce models with higher accuracy and better predictive capability than previously presented intelligent models.

A Study on the Estimation of the Threshold Rainfall in Standard Watershed Units (표준유역단위 한계강우량 산정에 관한 연구)

  • Choo, Kyung-Su;Kang, Dong-Ho;Kim, Byung-Sik
    • Journal of Korean Society of Disaster and Security / v.14 no.2 / pp.1-11 / 2021
  • In Korea, the risk of meteorological disasters has recently been increasing because of climate change, and damage caused by rainfall is receiving continued attention. Although current weather forecasts provide quantitative rainfall, predicting the extent of damage remains difficult. To understand the impact of damage, the threshold rainfall for each watershed is therefore required. Rainfall damage differs by region, and analyses that account for the characteristic factors of each watershed are limited. In addition, running a rainfall-runoff analysis through a hydrological model for every rainfall event takes a lot of time, so analyses often rely on simple rainfall data alone. This study used GIS data and calculated the threshold rainfall from the threshold runoff that causes flooding by coupling two hydrologic models. The results were verified against actual cases, and the analysis showed that damage generally occurred in the areas identified as dangerous. This study should make it possible to prepare for flood risk areas in advance, and accuracy is expected to improve if machine learning analysis methods are added.

Vulnerability Assessment for Fine Particulate Matter (PM2.5) in the Schools of the Seoul Metropolitan Area, Korea: Part I - Predicting Daily PM2.5 Concentrations (인공지능을 이용한 수도권 학교 미세먼지 취약성 평가: Part I - 미세먼지 예측 모델링)

  • Son, Sanghun;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.37 no.6_2 / pp.1881-1890 / 2021
  • Particulate matter (PM) affects humans, ecosystems, and weather. Motorized vehicles and combustion generate fine particulate matter (PM2.5), which can contain toxic substances and therefore requires systematic management. Consequently, it is important to monitor and predict PM2.5 concentrations, especially in large cities with dense populations and infrastructure. This study aimed to predict PM2.5 concentrations in large cities using meteorological and chemical variables as well as satellite-based aerosol optical depth. A random forest (RF) model was selected because, among machine learning models, it has shown excellent performance in predicting PM concentrations. For the performance indicators R2, RMSE, MAE, and MAPE, the model achieved training values of 0.97, 3.09, 2.18, and 13.31 and testing values of 0.82, 6.03, 4.36, and 25.79, respectively. The variables used in this study showed high correlation with PM2.5 concentrations. We therefore conclude that these variables can be used in a random forest model to generate reliable PM2.5 concentration predictions, which can then be used to assess the vulnerability of schools to PM2.5.

On the Effect of Air-Simulated Side-Jets on the Aerodynamic Characteristics of a Missile by Multi-Fidelity Modeling (다충실도 모형화를 통한 공기로 모사된 측방제트가 유도무기의 공력특성에 미치는 영향 연구)

  • Kang, Shinseong;Kang, Dayoung;Lee, Kyunghoon
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.49 no.2 / pp.95-106 / 2021
  • Side-jets let a missile maneuver more immediately than control surfaces do; however, they may adversely affect aerodynamic coefficients because they interfere with the freestream. To determine the impact of side-jets on aerodynamic coefficients, we simulate the side-jets as air and use multi-fidelity models to evaluate the differences between aerodynamic coefficients obtained with and without side-jets. We computed these differences to investigate side-jet effects under changes in Mach number, bank angle, and angle of attack. The results show that asymmetrically developed side-jets affect the longitudinal force and moment coefficients, and that the lateral force and moment coefficients change drastically for bank angles between -30 and 30 degrees. In contrast, side-jets hardly influence the axial force coefficients. For the axial moment coefficient, we could not determine the side-jet effect because of a lack of aerodynamic coefficient samples with respect to the Mach number. Overall, we confirm that side-jets change a missile's attitude because they considerably alter the longitudinal and lateral aerodynamic coefficients.

Estimation of Significant Wave Heights from X-Band Radar Based on ANN Using CNN Rainfall Classifier (CNN 강우여부 분류기를 적용한 ANN 기반 X-Band 레이다 유의파고 보정)

  • Kim, Heeyeon;Ahn, Kyungmo;Oh, Chanyeong
    • Journal of Korean Society of Coastal and Ocean Engineers / v.33 no.3 / pp.101-109 / 2021
  • Wave observations using a marine X-band radar are conducted by analyzing the radar signal backscattered from the sea surface. Wave parameters are extracted using the Modulation Transfer Function obtained from 3D wave number and frequency spectra, which are calculated by a 3D FFT of a time series of sea surface images (42 images per minute). The accuracy of the estimated significant wave height (Hs) therefore depends critically on the quality of the radar images. Wave observations during Typhoons Maysak and Haishen in the summer of 2020 show large errors in the estimated significant wave heights, because raindrops falling on the sea surface degraded the radar images. This paper presents an algorithm developed to increase the accuracy of wave height estimation from radar images by adopting a convolutional neural network (CNN) that automatically classifies radar images into rain and non-rain cases. An algorithm for deriving Hs is then proposed that creates different ANN models and applies them selectively according to the rain or non-rain case. The developed algorithm was applied to heavy rain cases during typhoons and showed substantially improved results.

Domain Knowledge Incorporated Counterfactual Example-Based Explanation for Bankruptcy Prediction Model (부도예측모형에서 도메인 지식을 통합한 반사실적 예시 기반 설명력 증진 방법)

  • Cho, Soo Hyun;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.307-332 / 2022
  • Bankruptcy prediction, a representative classification problem related to loan lending, investment decision making, and the profitability of financial institutions, is one of the most intensively studied areas in business applications. Many studies have demonstrated outstanding performance for bankruptcy prediction models that use artificial intelligence techniques. However, because most machine learning algorithms are "black boxes," providing users with explanations of AI has become a prominent research topic. Although there are many different approaches to explanation, this study focuses on explaining a bankruptcy prediction model using counterfactual examples. A counterfactual-based explanation provides an alternative case that shows users how to obtain the desired output from the model. This study introduces a counterfactual generation technique based on a genetic algorithm (GA) that leverages both domain knowledge (i.e., causal feasibility) and feature importance from a black-box model, along with other critical counterfactual variables, including proximity, distribution, and sparsity. The proposed method was evaluated quantitatively and qualitatively to measure its quality and validity.