• Title/Summary/Keyword: nonparametric techniques (비모수적 기법)

Estimation of the Input Wave Height of the Wave Generator for Regular Waves by Using Artificial Neural Networks and Gaussian Process Regression (인공신경망과 가우시안 과정 회귀에 의한 규칙파의 조파기 입력파고 추정)

  • Jung-Eun, Oh;Sang-Ho, Oh
    • Journal of Korean Society of Coastal and Ocean Engineers
    • /
    • v.34 no.6
    • /
    • pp.315-324
    • /
    • 2022
  • Experimental data obtained in a wave flume were analyzed using machine learning techniques to establish a model that predicts the input wave height of the wavemaker from waves that have undergone shoaling, and to verify the performance of that model. For this purpose, an artificial neural network (NN), the most representative machine learning technique, and Gaussian process regression (GPR), a non-parametric regression method, were each applied, and the predictive performance of the two models was compared. The analysis was performed separately for the case of using all the data at once and for the case of classifying the data by a criterion related to the occurrence of wave breaking. When the data were not classified, the error between the input wave height at the wavemaker and the measured value was relatively large for both the NN and GPR models. When the data were divided into non-breaking and breaking conditions, however, the accuracy of predicting the input wave height improved greatly. Of the two models, the GPR model performed better overall than the NN model.

Estimation of the Regional Future Sea Level Rise Using Long-term Tidal Data in the Korean Peninsula (장기 조위자료를 이용한 한반도 권역별 미래 해수면 상승 추정)

  • Lee, Cheol-Eung;Kim, Sang Ug;Lee, Yeong Seob
    • Journal of Korea Water Resources Association
    • /
    • v.47 no.9
    • /
    • pp.753-766
    • /
    • 2014
  • This article estimates the future mean sea level rise (MSLR) due to climate change at major harbors of the Korean Peninsula using statistical methods. First, the Mann-Kendall non-parametric trend test was applied to detect trends in the observed long-term tidal data, and Bayesian change point analysis was used to quantify the locations and magnitudes of change points. In particular, the results of the Bayesian change point analysis were used to combine 4 future MSLR scenario projections with local MSLR data at 5 tidal gauges. The proposed procedure improves the determination of the starting years of the future MSLR scenario projections with the 18.6-year lunar nodal tidal cycle and effectively considers the local characteristics at each gauge. The results show that the future MSLR is largest in the Jeju region (Jeju tidal gauge), followed by the Western region (Boryeong tidal gauge) and the Southern region (Busan tidal gauge). Finally, the future MSLRs in the Southern region (Yeosu tidal gauge) and the Eastern region (Sokcho tidal gauge) appear to show the smallest growth among the 5 gauges.
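The Mann-Kendall trend test named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `mann_kendall` is hypothetical, the variance formula omits the correction for tied values, and the normal approximation is assumed to be adequate for the series length.

```python
import math

def mann_kendall(values):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test.

    Assumption: no tie correction is applied to the variance of S.
    """
    n = len(values)
    s = 0
    # S counts concordant minus discordant pairs over all i < j.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p

# Hypothetical annual mean sea levels (cm), for illustration only.
levels = [10.1, 10.4, 10.2, 10.9, 11.3, 11.0, 11.8, 12.1]
s, z, p = mann_kendall(levels)
print(f"S={s}, Z={z:.3f}, p={p:.4f}")
```

A positive S with a small p-value would suggest an upward monotonic trend. Real tidal records usually contain ties and serial correlation, which this sketch does not handle.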

Effects of Financial College Tuition Support by Korean Parents using a Hierarchical Bayes Model (계층적 베이즈 모형을 이용한 대학등록금에 대한 부모님의 경제적 지원 영향 분석)

  • Oh, Man-Suk;Oh, Hyun Sook;Oh, Min Jung
    • The Korean Journal of Applied Statistics
    • /
    • v.26 no.2
    • /
    • pp.267-280
    • /
    • 2013
  • College tuition is a significant economic, social, and political issue in Korea. We conduct a Bayesian analysis of a hierarchical model to address the factors related to college tuition, based on survey data collected by Statistics Korea. A binary response variable is defined according to whether more than 70% of tuition costs are supported by parents, and a hierarchical probit model is constructed with areas as groups. A set of explanatory variables is selected from a factor analysis of the variables available in the survey. A Markov chain Monte Carlo algorithm is used to estimate the parameters. The analysis shows that income and stress are significantly related to college tuition support from parents. Parents with high income tend to support their children's college tuition, and students with parental financial support tend to be less mentally stressed; this shows that the economic status of parents significantly affects the mental health of college students. Gender, a healthy lifestyle, and college satisfaction are not significant factors. Comparing areas in terms of the degree of correlation between stress/income and tuition support from parents, students in Kangwon-do are the most mentally stressed when parental support is limited; in addition, the positive correlation between parental support and income is stronger in big cities than in provincial areas.

Recent Changes in Bloom Dates of Robinia pseudoacacia and Bloom Date Predictions Using a Process-Based Model in South Korea (최근 12년간 아까시나무 만개일의 변화와 과정기반모형을 활용한 지역별 만개일 예측)

  • Kim, Sukyung;Kim, Tae Kyung;Yoon, Sukhee;Jang, Keunchang;Lim, Hyemin;Lee, Wi Young;Won, Myoungsoo;Lim, Jong-Hwan;Kim, Hyun Seok
    • Journal of Korean Society of Forest Science
    • /
    • v.110 no.3
    • /
    • pp.322-340
    • /
    • 2021
  • Due to climate change and the consequent rise in spring temperature, the flowering time of Robinia pseudoacacia has advanced and simultaneous blooming has occurred in different regions of South Korea. These changes in flowering time have become a major crisis for the domestic beekeeping industry, and the demand for accurate prediction of the flowering time of R. pseudoacacia is increasing. In this study, we developed four models predicting the flowering time of R. pseudoacacia for the entire country and compared their performance: a Single Model for the country (SM), a Modified Single Model (MSM) using correction factors derived from SM, a Group Model (GM) estimating parameters for each region, and a Local Model (LM) estimating parameters for each site. For this purpose, bloom date data observed at 26 points across the country over 12 years (2006-2017) and daily temperature data were used. Bloom dates for the north central region, where the spring temperature increase was more than two-fold higher than in the southern regions, have advanced, and the difference from the southwest region decreased by 0.7098 days per year (p-value=0.0417). Model comparisons showed that MSM and LM performed better than the other models, with 24% and 15% lower RMSE than SM, respectively. Furthermore, validation with 16 additional sites over 4 years revealed that co-kriging of LM performed better than expansion of MSM to the entire nation (RMSE: p-value=0.0118, Bias: p-value=0.0471). This study improved predictions of bloom dates for R. pseudoacacia and proposed methods for reliable expansion to the entire nation.

Generalized kernel estimating equation for panel estimation of small area unemployment rates (소지역 실업률의 패널추정을 위한 일반화커널추정방정식)

  • Shim, Jooyong;Kim, Youngwon;Hwang, Changha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.24 no.6
    • /
    • pp.1199-1210
    • /
    • 2013
  • A high unemployment rate is one of the major problems in most countries today. Hence, the demand for small area labor statistics has increased rapidly over the past few years. However, since sample surveys for producing official statistics are mainly designed for large areas, it is difficult to produce reliable statistics at the small area level because of small sample sizes. Most existing studies of small area estimation are concerned with the estimation of parameters based on cross-sectional data. Because many official statistics are collected repeatedly at regular intervals, for instance monthly, quarterly, or yearly, we need an alternative model that can handle this type of panel data. In this paper, we derive the generalized kernel estimating equation, which can model time-dependency among response variables and handle repeated measurement or panel data. We compare the proposed estimating equation with the generalized linear model and the generalized estimating equation through simulation, and apply it to estimating the unemployment rates of 25 areas in Gyeongsangnam-do and Ulsan for 2005.

Validation of OMI HCHO with EOF and SVD over Tropical Africa (EOF와 SVD을 이용한 아프리카 지역에서 관측된 OMI HCHO 자료의 검증)

  • Kim, J.H.;Baek, K.H.;Kim, S.M.
    • Korean Journal of Remote Sensing
    • /
    • v.30 no.4
    • /
    • pp.417-430
    • /
    • 2014
  • We have found an error in the operational OMI HCHO columns and corrected it by applying a background parameterization derived from a 4th order polynomial fit to the time series of monthly average OMI HCHO data. The corrected OMI HCHO agrees with this understanding as well as with the measurements of other sensors and has no unrealistic trends. A new approach, statistical analysis with EOF and SVD, was adopted to reanalyze the consistency of the corrected OMI HCHO with other satellite measurements of HCHO, CO, $NO_2$, and fire counts over Africa. The EOF and SVD analyses with MOPITT CO, OMI $NO_2$, SCIAMACHY, and OMI HCHO show overall spatial and temporal patterns consistent with those of biomass burning over these regions. However, some discrepancies were observed in OMI HCHO over northern equatorial Africa during the northern biomass burning seasons: the maximum HCHO was found further downwind from where the maximum fire counts occur, and the minimum was found in January, when biomass burning is strongest. The statistical analysis revealed that the influence of biogenic activity on HCHO was not strong enough to cause the discrepancies; rather, they are caused by an error in OMI HCHO from using the wrong Air Mass Factor (AMF) associated with biomass burning aerosol. If the error is properly taken into consideration, biomass burning is the strongest source of HCHO seasonality over the regions. This study suggests that these statistical tools are a very efficient means of evaluating satellite data.

Analysis of the Efficiency of Gyeonggi-do Senior Welfare Centers by DEA Model (DEA를 이용한 경기도 노인복지관 효율성 분석)

  • Kim, Keum Hwan;Pak, Ae Kyung;Ryu, Seo Hyun;Lee, Nam Sik
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship
    • /
    • v.8 no.3
    • /
    • pp.165-177
    • /
    • 2013
  • The purpose of this study was to examine the efficiency of senior welfare centers and the causes of the differences among them in that regard, and to investigate the factors that influence those differences and the size of their influence. Methods for assessing the efficiency of senior welfare centers in light of their circumstances were reviewed, and post-hoc analyses were made using data envelopment analysis (DEA) and the modified DEA/AP model, which are useful tools for evaluating relative efficiency. Twenty senior welfare centers located in Gyeonggi-do were selected, and their yearly operating data for 2009 were used. The evaluation data released by the Gyeonggi Welfare Foundation were analyzed by DEA, a nonparametric method, and it was possible to obtain significant results on the regional operating efficiency of social welfare centers in 14 metropolitan cities and provinces, the causes and degree of their inefficiency, and which areas could serve as references. Because county-level data were used in this study, it is not quite possible to produce accurate results on the relative efficiency of senior welfare centers. Nevertheless, the study is significant in that it suggests how to evaluate the overall operating efficiency of senior welfare centers in the counties, including the degree of their operating inefficiency, what improvements should be made, and what reference groups there might be, and in that it provides information on the usefulness of the DEA model.

Data Augmentation using a Kernel Density Estimation for Motion Recognition Applications (움직임 인식응용을 위한 커널 밀도 추정 기반 학습용 데이터 증폭 기법)

  • Jung, Woosoon;Lee, Hyung Gyu
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.27 no.4
    • /
    • pp.19-27
    • /
    • 2022
  • In general, the performance of an ML (machine learning) application is determined by various factors such as the type of ML model, the size of the model (number of parameters), the hyperparameter settings used during training, and the training data. In particular, recognition accuracy may deteriorate or overfitting may occur if the amount of data used for training is insufficient. Existing studies focusing on image recognition have widely used open datasets for training and evaluating the proposed ML models. However, for specific applications where the sensor used, the target of recognition, and the recognition situation are different, the dataset must be built manually. In this case, the performance of ML depends largely on the quantity and quality of the data. In this paper, the training data used for a motion recognition application are augmented using the kernel density estimation algorithm, a type of non-parametric estimation method. We then compare and analyze the recognition accuracy of an ML application by varying the number of original data, the kernel type, and the augmentation rate used for data augmentation. Finally, experimental results show that the recognition accuracy is improved by up to 14.31% when using the narrow-bandwidth Tophat kernel.
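Augmenting data by sampling from a kernel density estimate with a top-hat kernel can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the function name `augment_tophat_kde`, the bandwidth, and the toy feature vectors are all hypothetical. Sampling from a top-hat KDE amounts to picking an original sample at random and adding a point drawn uniformly from a ball whose radius is the bandwidth.

```python
import math
import random

def augment_tophat_kde(samples, n_new, bandwidth, seed=0):
    """Draw n_new synthetic points from a top-hat KDE fitted to `samples`.

    `samples` is a list of equal-length feature vectors (lists of floats).
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    augmented = []
    for _ in range(n_new):
        center = rng.choice(samples)  # each original sample is one kernel
        # Uniform direction on the unit sphere via a normalized Gaussian vector.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(d * d for d in direction)) or 1.0
        # Radius scaled by u**(1/dim) so points are uniform inside the ball.
        radius = bandwidth * rng.random() ** (1.0 / dim)
        augmented.append([c + radius * d / norm
                          for c, d in zip(center, direction)])
    return augmented

# Hypothetical 3-axis accelerometer feature vectors, for illustration only.
original = [[0.1, 0.9, 0.0], [0.2, 1.0, 0.1], [0.0, 0.8, -0.1]]
synthetic = augment_tophat_kde(original, n_new=6, bandwidth=0.05)
print(len(synthetic), "synthetic samples generated")
```

A narrow bandwidth keeps each synthetic point close to an observed one, which matches the setting the abstract reports as performing best. The appropriate bandwidth for a given sensor would have to be tuned.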

A Simulation of Agro-Climate Index over the Korean Peninsula Using Dynamical Downscaling with a Numerical Weather Prediction Model (수치예보모형을 이용한 역학적 규모축소 기법을 통한 농업기후지수 모사)

  • Ahn, Joong-Bae;Hur, Ji-Na;Shim, Kyo-Moon
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.12 no.1
    • /
    • pp.1-10
    • /
    • 2010
  • A regional climate model (RCM) can be a powerful tool to enhance the spatial resolution of climate and weather information (IPCC, 2001). In this study we conducted dynamical downscaling using the Weather Research and Forecasting Model (WRF) as an RCM in order to obtain high-resolution regional agroclimate indices over the Korean Peninsula. To obtain detailed high-resolution agroclimate indices, we first reproduced regional weather for the period of March to June, 2002-2008, with the dynamical downscaling method under lateral boundary conditions given by NCEP/NCAR (National Centers for Environmental Prediction/National Center for Atmospheric Research) reanalysis data. Numerical model results normally show biases against observations because of uncertainties in the model's initial conditions, physical parameterizations, and our physical understanding of nature. Hence, in this study, the systematic bias in the model's results was estimated and corrected with a statistical method for better reproduction of climate at high resolution. As a result of the correction, the systematic bias of the model was properly corrected and the overall spatial patterns in the simulation were well reproduced, resulting in finer-resolution climatic structures. Based on these results, the fine-resolution agro-climate indices were estimated and presented. Compared with the indices derived from observation, the simulated indices reproduced the major and detailed spatial distributions. Our research shows that it is possible to simulate regional climate and agro-climate indices at high resolution by using a proper downscaling method with a dynamical weather forecast model and a statistical correction method to minimize the model bias.

A Study on Risk Parity Asset Allocation Model with XGBoost (XGBoost를 활용한 리스크패리티 자산배분 모형에 관한 연구)

  • Kim, Younghoon;Choi, HeungSik;Kim, SunWoong
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.135-149
    • /
    • 2020
  • Artificial intelligence is changing the world, and the financial market is no exception. Robo-advisors are being actively developed to compensate for the weaknesses of traditional asset allocation methods and to take over the parts that are difficult for those methods. A robo-advisor makes automated investment decisions with artificial intelligence algorithms and is used with various asset allocation models such as the mean-variance model, the Black-Litterman model, and the risk parity model. The risk parity model is a typical risk-based asset allocation model that focuses on the volatility of assets. It avoids investment risk structurally, so it is stable in the management of large funds and has been widely used in finance. XGBoost is a parallel tree-boosting method, an optimized gradient boosting model designed to be highly efficient and flexible. It can handle billions of examples in memory-limited environments and trains much faster than traditional boosting methods. It is frequently used in various fields of data analysis and has many advantages. In this study, we therefore propose a new asset allocation model that combines the risk parity model with the XGBoost machine learning model. The model uses XGBoost to predict the risk of assets and applies the predicted risk to the process of covariance estimation. Because an optimized asset allocation model estimates the proportion of investments from historical data, estimation errors arise between the estimation period and the actual investment period. These estimation errors adversely affect the performance of the optimized portfolio. This study aims to improve the stability and portfolio performance of the model by predicting the volatility of the next investment period and reducing the estimation errors of the optimized asset allocation model. As a result, it narrows the gap between theory and practice and proposes a more advanced asset allocation model.
In this study, we used Korean stock market price data for a total of 17 years, from 2003 to 2019, for the empirical test of the suggested model. The data sets are composed of the energy, finance, IT, industrial, material, telecommunication, utility, consumer, health care, and staple sectors. We accumulated the predicted values using a moving-window method with 1,000 in-sample and 20 out-of-sample observations, which produced a total of 154 rebalancing back-testing results. We analyzed portfolio performance in terms of the cumulative rate of return and obtained a large amount of sample data because of the long test period. Compared with the traditional risk parity model, this experiment recorded improvements in both cumulative return and reduction of estimation errors. The total cumulative return is 45.748%, about 5% higher than that of the risk parity model, and the estimation errors are reduced in 9 out of 10 industry sectors. The reduction of estimation errors increases the stability of the model and makes it easier to apply in practical investment. The results of the experiment showed an improvement in portfolio performance from reducing the estimation errors of the optimized asset allocation model. Many financial models and asset allocation models are limited in practical investment by the most fundamental question of whether the past characteristics of assets will continue into the future in a changing financial market. This study, however, not only takes advantage of traditional asset allocation models but also supplements the limitations of traditional methods and increases stability by predicting the risks of assets with the latest algorithm. There are various studies on parametric estimation methods to reduce the estimation errors in portfolio optimization. We also suggested a new method to reduce estimation errors in an optimized asset allocation model using machine learning.
So this study is meaningful in that it proposes an advanced artificial intelligence asset allocation model for the fast-developing financial markets.
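The idea of feeding predicted volatilities into a risk-based allocation can be sketched with inverse-volatility weighting, a simplified form of risk parity that ignores correlations between assets. This is not the authors' model: the function names and the volatility figures are hypothetical, and in the approach described above the predicted volatilities would come from an XGBoost model and enter a full covariance estimate.

```python
def inverse_vol_weights(predicted_vols):
    """Weights proportional to 1 / volatility, normalized to sum to 1."""
    inverse = [1.0 / v for v in predicted_vols]
    total = sum(inverse)
    return [x / total for x in inverse]

def risk_contributions(weights, vols):
    """Share of portfolio variance per asset, assuming zero correlations."""
    variance_terms = [(w * v) ** 2 for w, v in zip(weights, vols)]
    total = sum(variance_terms)
    return [t / total for t in variance_terms]

# Hypothetical predicted volatilities for three sectors, for illustration only.
vols = [0.10, 0.20, 0.40]
weights = inverse_vol_weights(vols)
shares = risk_contributions(weights, vols)
print([round(w, 4) for w in weights])
print([round(s, 4) for s in shares])
```

Under the zero-correlation assumption each asset contributes the same share of portfolio variance, which is the defining property of risk parity. With correlated assets the weights would have to be found numerically from the full covariance matrix.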