• Title/Abstract/Keywords: Performance-based Statistics

Search results: 1,048 items; processing time 0.023 seconds

A Comparative Study on the Performance of Bayesian Partially Linear Models

  • Woo, Yoonsung;Choi, Taeryon;Kim, Wooseok
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 19, No. 6
    • /
    • pp.885-898
    • /
    • 2012
  • In this paper, we consider Bayesian approaches to partially linear models, in which the regression function is represented in a semiparametric additive form as the sum of a parametric linear regression function and a nonparametric regression function. We carry out a comparative empirical study of the performance of widely used Bayesian partially linear models. Specifically, we deal with three Bayesian methods for estimating the nonparametric regression function: one using a Fourier series representation, another based on a Gaussian process regression approach, and a third based on the smoothness of the function and differencing. We compare the numerical performance of the three methods by the root mean squared error (RMSE). For the empirical analysis, we consider synthetic data in simulation studies and a real data application, fitting each data set with the three Bayesian methods and comparing the RMSEs.
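
The RMSE comparison described in this abstract can be sketched as follows. This is only a minimal illustration: the observations, the fitted values, and the names `method_a` and `method_b` are hypothetical placeholders, not the paper's data or its three Bayesian methods.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical observations and fitted values from two competing methods.
y = [1.0, 2.0, 3.0, 4.0]
fits = {
    "method_a": [1.1, 1.9, 3.2, 3.8],
    "method_b": [1.5, 2.5, 2.5, 4.5],
}

# Score every method on the same data and pick the smallest RMSE.
scores = {name: rmse(y, fit) for name, fit in fits.items()}
best = min(scores, key=scores.get)
```

Scoring all methods on the same data set is what makes the RMSE values comparable.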

The Role of Artificial Observations in Misclassified Binary Data with Common False-Positive Error

  • Lee, Seung-Chun
    • The Korean Journal of Applied Statistics
    • /
    • Vol. 25, No. 4
    • /
    • pp.697-706
    • /
    • 2012
  • An Agresti-Coull type test is considered for the difference of binomial proportions in two doubly sampled data subject to common false-positive error. The performance of the test is compared with likelihood-based tests. The Agresti-Coull test has many desirable properties in that it can approximate the nominal significance level well, and has comparable power performance with a computational advantage.
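
For background, the classic Agresti-Coull idea of adding artificial observations can be sketched for a single binomial proportion. This is not the paper's test, which concerns the difference of two proportions under double sampling with false-positive error; the function name and the example counts below are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def agresti_coull_interval(successes, n, alpha=0.05):
    """Agresti-Coull confidence interval for one binomial proportion.

    Adds z^2/2 artificial successes and z^2/2 artificial failures,
    then applies the usual Wald formula to the adjusted counts.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n_adj = n + z * z
    p_adj = (successes + z * z / 2) / n_adj
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clip to [0, 1], since the raw interval can leave the unit interval.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)

# With 0 successes in 10 trials the plain Wald interval collapses to a point,
# while the adjusted interval keeps a non-zero width.
lo, hi = agresti_coull_interval(0, 10)
```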

Ensemble variable selection using genetic algorithm

  • Seogyoung, Lee;Martin Seunghwan, Yang;Jongkyeong, Kang;Seung Jun, Shin
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 29, No. 6
    • /
    • pp.629-640
    • /
    • 2022
  • Variable selection is one of the most crucial tasks in supervised learning, such as regression and classification. Best subset selection is straightforward and optimal but not practically applicable unless the number of predictors is small. In this article, we propose directly solving the best subset selection problem via the genetic algorithm (GA), a popular stochastic optimization algorithm based on the principle of Darwinian evolution. To further improve the variable selection performance, we propose to run multiple GAs to solve the best subset selection problem and then synthesize the results, which we call ensemble GA (EGA). The EGA significantly improves variable selection performance. In addition, the proposed method is essentially best subset selection and is hence applicable to a variety of models with different selection criteria. We compare the proposed EGA to existing variable selection methods under various models, including linear regression, Poisson regression, and Cox regression for survival data. Both simulation and real data analysis demonstrate the promising performance of the proposed method.

SELECTION PROCEDURES TO SELECT POPULATIONS BETTER THAN A CONTROL

  • Kumar, Narinder;Khamnel, H.J.
    • Journal of the Korean Statistical Society
    • /
    • Vol. 32, No. 2
    • /
    • pp.151-162
    • /
    • 2003
  • In this paper, we propose two selection procedures for selecting populations better than a control population. Bestness is defined in terms of the location parameter. One of the procedures is based on two-sample linear rank statistics, whereas the other is based on a comparatively simple statistic and is useful when testing time is expensive, so that early termination of an experiment is desirable. The proposed selection procedures are seen to be strongly monotone. The performance of the proposed procedures is assessed through a simulation study.

Value at Risk Forecasting Based on Quantile Regression for GARCH Models

  • Lee, Sang-Yeol;Noh, Jung-Sik
    • The Korean Journal of Applied Statistics
    • /
    • Vol. 23, No. 4
    • /
    • pp.669-681
    • /
    • 2010
  • Value-at-Risk (VaR) is an important part of risk management in the financial industry. This paper presents a VaR forecasting method for financial time series based on the quantile regression for GARCH models recently developed by Lee and Noh (2009). The proposed VaR forecasting features direct conditional quantile estimation for GARCH models that is well connected with the model parameters. Empirical performance is measured by several backtesting procedures and is reported in comparison with existing methods using sample quantiles.

Performance analysis of CFAR detectors based on order statistics for nonhomogeneous background

  • 한동석
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • Vol. 22, No. 7
    • /
    • pp.1550-1558
    • /
    • 1997
  • In this paper, we first propose a modified OS CFAR detector called the order statistics cell averaging (OSCA) CFAR detector and analyze its performance for a Rayleigh target in homogeneous backgrounds, clutter edges, and statistics smallest of (OSSO) CFAR detectors for a Rayleigh target to nonhomogeneous environments. Computer simulation results show that the OSCA CFAR detector has superior performance to OS, OSGO, and OSSO CFAR detectors in homogeneous and multiple target environments. The proposed detector is also suitable for fast detection because it requires half the processing time of the OS CFAR detector.
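
As background for the detectors compared here, a baseline OS CFAR rule (not the proposed OSCA detector) can be sketched as below. The window sizes, the rank `k`, and the threshold factor `scale` are hypothetical values chosen for illustration; in practice the factor would be derived from the desired false-alarm rate.

```python
def os_cfar(power, num_ref=4, num_guard=1, k=6, scale=4.0):
    """Return the indices flagged as detections by a basic OS CFAR rule.

    For each cell under test, num_ref reference cells are taken on each
    side, skipping num_guard guard cells. The k-th smallest reference
    sample (1-based) serves as the noise estimate, and the cell is declared
    a detection when it exceeds scale times that estimate.
    """
    power = list(power)
    half = num_ref + num_guard
    detections = []
    for i in range(half, len(power) - half):
        leading = power[i - half : i - num_guard]
        trailing = power[i + num_guard + 1 : i + half + 1]
        reference = sorted(leading + trailing)
        noise_estimate = reference[k - 1]
        if power[i] > scale * noise_estimate:
            detections.append(i)
    return detections

# Hypothetical range profile: flat noise floor with two closely spaced targets.
profile = [1.0] * 30
profile[10] = 20.0
profile[13] = 20.0
hits = os_cfar(profile)
```

Because the noise estimate is an order statistic rather than a mean, a strong neighbouring target inside the reference window does not raise the threshold, which is why both targets are found in this example.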

Reconsideration of F1 Score as a Performance Measure in Mass Spectrometry-based Metabolomics

  • Jeong, Jaesik;Kim, Han Sol;Kim, Shin June
    • Journal of Integrative Natural Science
    • /
    • Vol. 11, No. 3
    • /
    • pp.161-164
    • /
    • 2018
  • Over the past decade, mass spectrometry-based metabolomics, especially two-dimensional gas chromatography mass spectrometry (GCxGC/TOF-MS), has become a key analytical tool for metabolomics data because of its sensitivity and ability to analyze complex biological or biochemical samples. However, the need to reduce variations within and between experiments has been reported, and methodological developments to overcome such problems have long been a critical issue. Along with methodological developments, the development of reasonable performance measures has also been studied. The following four numerical measures have typically been used for comparison: sensitivity, specificity, receiver operating characteristic (ROC) curves, and positive predictive value (PPV). More recently, however, such measures have been replaced with the F1 score in many fields, including metabolomics, without careful consideration of its validity. Thus, we investigate the validity of the F1 score on two examples, with the goal of raising awareness in choosing an appropriate performance comparison measure. We noticed that the F1 score by itself was not good enough as a performance measure. Accordingly, we suggest that the F1 score be supplemented with other performance measures, such as specificity, to improve its validity.
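
One way to see why the F1 score may need a companion measure is that it does not use the true-negative count at all. The sketch below uses hypothetical confusion-matrix counts, not the paper's two examples, to show two classifiers with the same F1 score but very different specificity.

```python
def classification_measures(tp, fp, fn, tn):
    """Compute PPV, sensitivity, specificity and F1 from confusion counts."""
    ppv = tp / (tp + fp)              # positive predictive value (precision)
    sensitivity = tp / (tp + fn)      # recall
    specificity = tn / (tn + fp)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "ppv": ppv,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical counts: identical tp, fp, fn, but different true negatives.
few_negatives = classification_measures(tp=90, fp=10, fn=10, tn=10)
many_negatives = classification_measures(tp=90, fp=10, fn=10, tn=990)
```

Both cases yield an F1 score of 0.9, while specificity is 0.5 in the first and 0.99 in the second, so reporting specificity alongside F1 separates them.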

A Study on the Performance Evaluation of the College-Entrance Processes

  • 오정현;정재윤;홍영훈;박상규;김삼용
    • The Korean Journal of Applied Statistics
    • /
    • Vol. 23, No. 5
    • /
    • pp.987-996
    • /
    • 2010
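  • College admission processes aim to select students who fit each university's educational purpose and match the profile of talent suited to achieving its educational goals. Recently, with the introduction of the admissions officer system, efforts have been made to establish admission tracks that identify and select students based on their potential. In this regard, it is necessary to verify statistically whether the various evaluation methods reflect students' qualitative characteristics and assess their potential well, and whether students suited to the characteristics of each academic field and recruitment unit are properly evaluated according to the admission factors. Based on college academic achievement, we analyze the influence of various factors using statistical methods.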

Prediction of extreme PM2.5 concentrations via extreme quantile regression

  • Lee, SangHyuk;Park, Seoncheol;Lim, Yaeji
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 29, No. 3
    • /
    • pp.319-331
    • /
    • 2022
  • In this paper, we develop a new statistical model to forecast the PM2.5 level in Seoul, South Korea. The proposed model is based on the extreme quantile regression model with lasso penalty. Various meteorological variables and air pollution variables are considered as predictors in the regression model, and the lasso quantile regression performs variable selection and solves the multicollinearity problem. The final prediction model is obtained by combining various extreme lasso quantile regression estimators and we construct a binary classifier based on the model. Prediction performance is evaluated through the statistical measures of the performance of a binary classification test. We observe that the proposed method works better compared to the other classification methods, and predicts 'very bad' cases of the PM2.5 level well.

Performance Improvement of a Contents-based Recommendation System by Increasing Movie Metadata

  • 서진경;최다정;백주련
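    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • Korean Society of Computer Information, Proceedings of the 65th Winter Conference (2022), Vol. 30, No. 1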
    • /
    • pp.23-26
    • /
    • 2022
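  • Now that the number of OTT service users is growing explosively, recommending tailored items to users is an important issue for such services. In this paper, we propose a content-based recommendation system model and aim to adopt a final model with improved predictive power by incrementally adding movie data. To this end, we collected movie data from GroupLens and Kaggle and used 71026 movie ratings given by 943 users to a total of 1111 movies. In the model evaluation, the RMSE of the recommendation model using only genres and keywords was 1.3076, while the RMSE of the model built by adding data step by step, which finally used genres, keywords, actors, directors, countries, and production companies, was 1.1870, so the model with all data added had the higher predictive power. Accordingly, the model implemented with genres, keywords, actors, directors, countries, and production companies is adopted as the final model, and a movie recommendation list is produced for one randomly selected user.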