• Title/Summary/Keyword: Maximum likelihood cross-validation


Bandwidth selections based on cross-validation for estimation of a discontinuity point in density (교차타당성을 이용한 확률밀도함수의 불연속점 추정의 띠폭 선택)

  • Huh, Jib
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.4
    • /
    • pp.765-775
    • /
    • 2012
  • Cross-validation is a popular method for selecting the bandwidth in all types of kernel estimation. Maximum likelihood cross-validation, least squares cross-validation, and biased cross-validation have been proposed for bandwidth selection in kernel density estimation. For the case in which the probability density function has a discontinuity point, Huh (2012) proposed a bandwidth selection method using maximum likelihood cross-validation. In this paper, two forms of cross-validation with a one-sided kernel function are proposed for selecting the bandwidth used to estimate the location and jump size of the discontinuity point of a density. These methods are motivated by least squares cross-validation and biased cross-validation. Through simulated examples, the finite-sample performances of the two proposed methods are compared with that of Huh (2012).
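Maximum likelihood cross-validation chooses the bandwidth that maximizes the leave-one-out log-likelihood sum_i log f_hat_{-i}(x_i). A minimal sketch of this criterion with a Gaussian kernel (the sample, seed, and bandwidth grid are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def mlcv_score(x, h):
    """Leave-one-out log-likelihood of a Gaussian-kernel density estimate at bandwidth h."""
    n = len(x)
    d = (x[:, None] - x[None, :]) / h                  # pairwise scaled differences
    k = np.exp(-0.5 * d**2) / np.sqrt(2 * np.pi)       # Gaussian kernel values
    np.fill_diagonal(k, 0.0)                           # leave-one-out: drop the i = j term
    fhat = k.sum(axis=1) / ((n - 1) * h)               # f_hat_{-i}(x_i) for every i
    return np.log(fhat).sum()

rng = np.random.default_rng(0)
x = rng.normal(size=200)
grid = np.linspace(0.05, 1.5, 60)
h_opt = grid[np.argmax([mlcv_score(x, h) for h in grid])]
print(h_opt)
```

The one-sided variants discussed in the paper replace the symmetric kernel with kernels supported on one side of each point; the selection principle is the same.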

Bandwidth selection for discontinuity point estimation in density (확률밀도함수의 불연속점 추정을 위한 띠폭 선택)

  • Huh, Jib
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.1
    • /
    • pp.79-87
    • /
    • 2012
  • In the case that the probability density function has a discontinuity point, Huh (2002) estimated the location and jump size of the discontinuity point based on the difference between the right and left kernel density estimators constructed with one-sided kernel functions. In this paper, we consider a cross-validation criterion, built from the right and left maximum likelihood cross-validations, for selecting the bandwidth used to estimate the location and jump size of the discontinuity point. This method is motivated by the one-sided cross-validation of Hart and Yi (1998). The finite-sample performance is illustrated by a simulated example.

Logistic Regression Method in Interval-Censored Data

  • Yun, Eun-Young;Kim, Jin-Mi;Ki, Choong-Rak
    • The Korean Journal of Applied Statistics
    • /
    • v.24 no.5
    • /
    • pp.871-881
    • /
    • 2011
  • In this paper, we propose a logistic regression method to estimate the survival function and the median survival time in interval-censored data. The proposed method is motivated by the data augmentation technique, with no sacrifice made in augmenting the data. In addition, we develop a cross-validation criterion to determine the size of the data augmentation. We compare the proposed estimator with existing methods such as the parametric method, the single point imputation method, and the nonparametric maximum likelihood estimator through extensive numerical studies, and show that the proposed estimator performs better in terms of mean squared error. An illustrative example based on a real data set is given.

Improvement of Basis-Screening-Based Dynamic Kriging Model Using Penalized Maximum Likelihood Estimation (페널티 적용 최대 우도 평가를 통한 기저 스크리닝 기반 크리깅 모델 개선)

  • Min-Geun Kim;Jaeseung Kim;Jeongwoo Han;Geun-Ho Lee
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • v.36 no.6
    • /
    • pp.391-398
    • /
    • 2023
  • In this paper, a penalized maximum likelihood estimation (PMLE) method that applies a penalty to increase the accuracy of a basis-screening-based Kriging model (BSKM) is introduced. The maximum order and set of basis functions used in the BSKM are determined according to their importance. In this regard, the cross-validation error (CVE) of the basis functions is employed as an indicator of importance. When constructing the Kriging model (KM), the maximum order of the basis functions is determined, the importance of each basis function is evaluated up to that maximum order, and finally the optimal set of basis functions is determined. This optimal set is created by adding basis functions one by one in order of importance until the CVE of the KM is minimized. In this process, the KM must be generated repeatedly, and the hyper-parameters representing correlations between data sets must be calculated by maximum likelihood estimation. Because the optimal set of basis functions depends on these hyper-parameters, they have a significant impact on the accuracy of the KM. The PMLE method is applied to calculate the hyper-parameters accurately. It was confirmed that the accuracy of a BSKM can be improved by applying the method to the Branin-Hoo problem.
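For a least-squares surrogate, the cross-validation error used to rank basis functions can be computed without refitting, via the hat-matrix identity e_i^loo = (y_i - yhat_i) / (1 - H_ii). A generic sketch of that idea (the polynomial bases, data, and seed are hypothetical stand-ins, not the paper's Kriging bases):

```python
import numpy as np

def loocv_error(X, y):
    """Leave-one-out CV error of ordinary least squares via the hat-matrix shortcut."""
    H = X @ np.linalg.solve(X.T @ X, X.T)          # hat matrix H = X (X'X)^{-1} X'
    resid = y - H @ y                              # ordinary residuals
    return np.mean((resid / (1.0 - np.diag(H))) ** 2)

# rank candidate bases (polynomial orders here) by their cross-validation error
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 40)
y = 1.0 + 2.0 * x + 0.5 * x**2 + 0.05 * rng.normal(size=40)
cve = {p: loocv_error(np.vander(x, p + 1), y) for p in (1, 2, 3)}
print(min(cve, key=cve.get))
```

The paper's BSKM applies the same ranking idea to Kriging basis functions, where each CVE evaluation additionally involves the correlation hyper-parameters.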

Application of universal kriging for modeling a groundwater level distribution 2. Restricted maximum likelihood method (지하수위 분포 모델링을 위한 UNIVERSAL KRIGING의 응용 2. 제한적 최대 우도법)

  • 정상용
    • The Journal of Engineering Geology
    • /
    • v.3 no.1
    • /
    • pp.51-61
    • /
    • 1993
  • The restricted maximum likelihood (RML) method was used to determine the parameters of the generalized covariance, and universal kriging with RML was applied to estimate the groundwater level distribution of a nonstationary random function. Universal kriging with RML was compared with IRF-k using the weighted least squares method to compare their accuracies. Cross-validation shows that the two methods have nearly the same ability to estimate groundwater levels. Scattergrams of estimates versus true values and contour maps of groundwater levels show nearly the same results. The reason the two methods produced the same results is thought to be the non-Gaussian distribution and the small number of sample data.


Precipitation Analysis Based on Spatial Linear Regression Model (공간적 상관구조를 포함하는 선형회귀모형을 이용한 강수량 자료 분석)

  • Jung, Ji-Young;Jin, Seo-Hoon;Park, Man-Sik
    • The Korean Journal of Applied Statistics
    • /
    • v.21 no.6
    • /
    • pp.1093-1107
    • /
    • 2008
  • In this study, we considered linear regression models with various spatial dependency structures in order to make more reliable predictions of precipitation in South Korea. The prediction approaches are based on semi-variogram models fitted by the least-squares estimation method and by the restricted maximum likelihood estimation method. We validated candidate models from the two estimation methods in terms of cross-validation and of comparisons between predicted values and values observed at different locations.
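Fitting a semi-variogram model by least squares amounts to minimizing the squared distance between the model and the empirical semi-variogram values. A minimal grid-search sketch with an exponential model (the lags, parameter values, and data are illustrative assumptions, not the study's precipitation data):

```python
import numpy as np

def exp_semivariogram(h, nugget, psill, rang):
    """Exponential model: gamma(h) = nugget + psill * (1 - exp(-h / rang))."""
    return nugget + psill * (1.0 - np.exp(-h / rang))

# synthetic "empirical" semivariogram values generated from a known model
h = np.linspace(0.5, 10, 20)
gamma_emp = exp_semivariogram(h, 0.2, 1.0, 3.0)

# least-squares fit by grid search over the range parameter (nugget and sill fixed here)
cand = np.linspace(0.5, 8.0, 76)
sse = [np.sum((exp_semivariogram(h, 0.2, 1.0, a) - gamma_emp) ** 2) for a in cand]
best = cand[np.argmin(sse)]
print(best)  # → 3.0 (recovers the generating range exactly, since the data are noise-free)
```

REML estimation, by contrast, maximizes the restricted likelihood of the covariance parameters rather than a curve-fitting criterion.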

Estimating GARCH models using kernel machine learning (커널기계 기법을 이용한 일반화 이분산자기회귀모형 추정)

  • Hwang, Chang-Ha;Shin, Sa-Im
    • Journal of the Korean Data and Information Science Society
    • /
    • v.21 no.3
    • /
    • pp.419-425
    • /
    • 2010
  • Kernel machine learning is gaining popularity in the analysis of large or high-dimensional nonlinear data. We use this technique to estimate a GARCH model for predicting the conditional volatility of stock market returns. GARCH models are usually estimated using maximum likelihood (ML) procedures, assuming that the data are normally distributed. In this paper, we show that GARCH models can be estimated using kernel machine learning and that the kernel machine has higher predictive ability than ML methods and the support vector machine when estimating the volatility of financial time series data with fat tails.
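The Gaussian ML estimation the abstract refers to minimizes the GARCH(1,1) negative log-likelihood built from the variance recursion sigma2[t] = w + a*r[t-1]^2 + b*sigma2[t-1]. A minimal sketch under normality (the parameter values and simulated path are illustrative, not from the paper):

```python
import numpy as np

def garch11_nll(params, r):
    """Gaussian negative log-likelihood of a GARCH(1,1) model for returns r."""
    w, a, b = params
    if w <= 0 or a < 0 or b < 0 or a + b >= 1:
        return np.inf                          # reject invalid / non-stationary parameters
    sigma2 = np.empty_like(r)
    sigma2[0] = r.var()                        # a common initialization choice
    for t in range(1, len(r)):
        sigma2[t] = w + a * r[t - 1] ** 2 + b * sigma2[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + r ** 2 / sigma2)

# simulate a GARCH(1,1) path with hypothetical parameters and evaluate the likelihood
rng = np.random.default_rng(1)
w0, a0, b0 = 0.1, 0.1, 0.8
r = np.zeros(1000)
s2 = w0 / (1 - a0 - b0)                        # start at the unconditional variance
for t in range(1000):
    r[t] = np.sqrt(s2) * rng.normal()
    s2 = w0 + a0 * r[t] ** 2 + b0 * s2
print(garch11_nll((w0, a0, b0), r))
```

In practice this objective is minimized with a numerical optimizer; the paper's kernel-machine approach replaces this parametric likelihood with a nonparametric regression on the squared returns.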

Fast Bayesian Inversion of Geophysical Data (지구물리 자료의 고속 베이지안 역산)

  • Oh, Seok-Hoon;Kwon, Byung-Doo;Nam, Jae-Cheol;Kee, Duk-Kee
    • Journal of the Korean Geophysical Society
    • /
    • v.3 no.3
    • /
    • pp.161-174
    • /
    • 2000
  • Bayesian inversion is a stable approach to inferring subsurface structure from the limited data of geophysical explorations. In the geophysical inverse process, some uncertainties are inherent due to the finite and discrete characteristics of field data and of the modeling process, so a probabilistic approach to geophysical inversion is required. The Bayesian framework provides a theoretical basis for the confidence and uncertainty analysis of the inference. However, most Bayesian inversions require high-dimensional integration, so massive calculations such as Monte Carlo integration are needed. Although this approach seems well suited to geophysical problems, which are highly nonlinear, promptness and convenience are also required in field work. In this study, a fast Bayesian inversion scheme is developed using Gaussian approximations for the observed data and the a priori information, and it is applied to model problems with electric well logging and dipole-dipole resistivity data. The covariance matrices are derived by a geostatistical method, and an optimization technique yields the maximum a posteriori solution. In particular, the a priori information is evaluated by the cross-validation technique. Finally, an uncertainty analysis based on simulation of the a posteriori covariance matrix was performed to interpret the resistivity structure.
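Under a Gaussian approximation, the maximum a posteriori model for a linear forward problem d = Gm + noise has the closed form m_MAP = (G' Cd^-1 G + Cm^-1)^-1 G' Cd^-1 d (zero prior mean). A small sketch with hypothetical operators and covariances, not the paper's resistivity setup:

```python
import numpy as np

rng = np.random.default_rng(3)
G = rng.normal(size=(30, 5))               # hypothetical linear forward operator
m_true = np.array([1.0, -0.5, 0.3, 0.0, 2.0])
d = G @ m_true + 0.01 * rng.normal(size=30)

Cd_inv = np.eye(30) / 0.01**2              # inverse data covariance (iid noise, sd 0.01)
Cm_inv = np.eye(5) / 10.0**2               # broad Gaussian prior on the model parameters

# closed-form MAP estimate: no sampling needed once everything is Gaussian
m_map = np.linalg.solve(G.T @ Cd_inv @ G + Cm_inv, G.T @ Cd_inv @ d)
print(np.round(m_map, 2))
```

This is the kind of direct solve that makes the Gaussian-approximated scheme fast relative to Monte Carlo integration; the posterior covariance (G' Cd^-1 G + Cm^-1)^-1 then supports the uncertainty analysis.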


Fire Severity Mapping Using a Single Post-Fire Landsat 7 ETM+ Imagery (단일 시기의 Landsat 7 ETM+ 영상을 이용한 산불피해지도 작성)

  • 원강영;임정호
    • Korean Journal of Remote Sensing
    • /
    • v.17 no.1
    • /
    • pp.85-97
    • /
    • 2001
  • The KT (Kauth-Thomas) and IHS (Intensity-Hue-Saturation) transformation techniques were introduced and compared in order to investigate fire-scarred areas with a single post-fire Landsat 7 ETM+ image. This study consists of two parts. First, using only geometrically corrected imagery, it was examined whether the different levels of fire damage could be detected by a simple slicing method within the image enhanced by the IHS transform. Because the spectral distributions of the classes overlapped on each IHS component, the simple slicing method did not appear appropriate for delineating areas with different levels of fire severity. Second, the image, rectified both radiometrically and topographically, was enhanced by the KT transformation and the IHS transformation, respectively. The images were then classified by the maximum likelihood method, and cross-validation was performed to compensate for the relatively small set of ground-truth data. The results showed that the KT transformation produced better accuracy than the IHS transformation. In addition, the KT feature spaces and the spectral distributions of the IHS components were analyzed graphically. This study has shown that, for detecting different levels of fire severity, the KT transformation reflects the ground physical conditions better than the IHS transformation.
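Maximum likelihood classification, as used above, assigns each pixel to the class whose multivariate Gaussian log-likelihood is highest. A minimal two-class sketch (the class means and covariances are hypothetical, not derived from the ETM+ data):

```python
import numpy as np

def ml_classify(x, means, covs):
    """Assign a pixel's band vector x to the class with the highest Gaussian log-likelihood."""
    scores = []
    for mu, S in zip(means, covs):
        diff = x - mu
        # log N(x | mu, S) up to a constant: -0.5 * (log|S| + (x-mu)' S^-1 (x-mu))
        scores.append(-0.5 * (np.log(np.linalg.det(S)) + diff @ np.linalg.solve(S, diff)))
    return int(np.argmax(scores))

# two hypothetical classes in a 2-band feature space (e.g. KT brightness/greenness)
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
covs = [np.eye(2), 2.0 * np.eye(2)]
print(ml_classify(np.array([0.3, -0.2]), means, covs))  # → 0
```

In supervised classification the class means and covariances are estimated from training pixels, which is where the paper's cross-validation over a small ground-truth set comes in.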

Ordinary kriging approach to predicting long-term particulate matter concentrations in seven major Korean cities

  • Kim, Sun-Young;Yi, Seon-Ju;Eum, Young Seob;Choi, Hae-Jin;Shin, Hyesop;Ryou, Hyoung Gon;Kim, Ho
    • Environmental Analysis Health and Toxicology
    • /
    • v.29
    • /
    • pp.12.1-12.8
    • /
    • 2014
  • Objectives: Cohort studies of associations between air pollution and health have used exposure prediction approaches to estimate individual-level concentrations. A common prediction method used in Korean cohort studies is ordinary kriging. In this study, the performance of ordinary kriging models for long-term concentrations of particulate matter less than or equal to $10{\mu}m$ in diameter ($PM_{10}$) in seven major Korean cities was investigated, with a focus on spatial prediction ability. Methods: We obtained hourly $PM_{10}$ data for 2010 at 226 urban-ambient monitoring sites in South Korea and computed annual average $PM_{10}$ concentrations at each site. Given the annual averages, we developed ordinary kriging prediction models for each of the seven major cities and for the entire country, using an exponential covariance reference model and a maximum likelihood estimation method. For model evaluation, cross-validation was performed and the mean square error and R-squared ($R^2$) statistics were computed. Results: Mean annual average $PM_{10}$ concentrations in the seven major cities ranged between 45.5 and $66.0{\mu}g/m^3$ (standard deviation=2.40 and $9.51{\mu}g/m^3$, respectively). Cross-validated $R^2$ values in Seoul and Busan were 0.31 and 0.23, respectively, whereas the other five cities had $R^2$ values of zero. The national model produced a higher cross-validated $R^2$ (0.36) than the city-specific models. Conclusions: The ordinary kriging models generally performed poorly for the seven individual cities, while the national model performed somewhat better. To improve model performance, future studies should examine different prediction approaches that incorporate $PM_{10}$ source characteristics.
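Ordinary kriging with an exponential covariance, as in the models above, predicts at a new location by solving a small linear system whose weights are constrained to sum to one. A minimal sketch (the coordinates, values, and covariance parameters are illustrative, not the study's $PM_{10}$ data):

```python
import numpy as np

def ordinary_kriging(coords, z, x0, sill=1.0, rang=1.0):
    """Ordinary kriging prediction at x0 with exponential covariance C(h) = sill*exp(-h/rang)."""
    n = len(z)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    C = sill * np.exp(-d / rang)
    # augmented system enforcing that the weights sum to one (Lagrange multiplier)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = C
    A[n, n] = 0.0
    d0 = np.linalg.norm(coords - x0, axis=-1)
    b = np.append(sill * np.exp(-d0 / rang), 1.0)
    w = np.linalg.solve(A, b)[:n]
    return w @ z

coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
z = np.array([1.0, 2.0, 3.0])
print(ordinary_kriging(coords, z, np.array([0.0, 0.0])))  # → 1.0 (exact at a data site)
```

With no nugget effect, kriging interpolates the observations exactly, which is why sparse or clustered monitors (as in the city-specific models) limit how much spatial structure the predictor can exploit.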