• Title/Summary/Keyword: conditional distribution

Local Uncertainty of Thickness of Consolidation Layer for Songdo New City (송도신도시 압밀층 두께의 국부적 불확실성 평가)

  • Kim, Dong-Hee;Ryu, Dong-Woo;Chae, Young-Ho;Lee, Woo-Jin
    • Journal of the Korean Geotechnical Society
    • /
    • v.28 no.1
    • /
    • pp.17-27
    • /
    • 2012
  • Since geologic data are often sampled at sparse locations, it is important not only to predict attribute values at unsampled locations but also to assess the uncertainty attached to the prediction. In this study the assessment of the local uncertainty of prediction for the thickness of the consolidation layer was performed by using the indicator approach. A conditional cumulative distribution function (ccdf) was first modeled, and then E-type estimates and the conditional variance were computed for the spatial distribution of the thickness of the consolidation layer. These results could be used to estimate the spatial distribution of secondary compression and to assess the local uncertainty of secondary compression for Songdo New City.
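
To make the last step of the abstract above concrete, the following is a minimal sketch of how an E-type estimate and a conditional variance might be computed once a ccdf is available at a few thresholds. It is not the authors' implementation. The function name `etype_and_variance`, the bounds `z_min`/`z_max`, and the use of class midpoints as class means are illustrative assumptions.

```python
import numpy as np

def etype_and_variance(thresholds, ccdf, z_min, z_max):
    """E-type estimate (ccdf mean) and conditional variance of a discretized ccdf.

    thresholds : increasing cutoffs z_1 < ... < z_K
    ccdf       : F(z_k) = Prob{Z <= z_k | data}, non-decreasing values in [0, 1]
    z_min/z_max: assumed lower/upper bounds of the attribute (a modeling choice)
    """
    z = np.concatenate(([z_min], np.asarray(thresholds, dtype=float), [z_max]))
    F = np.concatenate(([0.0], np.asarray(ccdf, dtype=float), [1.0]))
    probs = np.diff(F)             # probability mass of each class (z_k, z_{k+1}]
    mids = 0.5 * (z[:-1] + z[1:])  # assumed class mean: the class midpoint
    mean = np.sum(probs * mids)    # E-type estimate
    # Discrete approximation: spread of class means only, within-class spread ignored
    var = np.sum(probs * (mids - mean) ** 2)
    return mean, var

# Hypothetical ccdf at one location (values invented for illustration)
mean, var = etype_and_variance([10.0, 20.0], [0.25, 0.75], z_min=0.0, z_max=30.0)
print(mean, var)
```

Locations with a large conditional variance are those where the prediction is least certain under this discretization.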

A Deep Learning Based Over-Sampling Scheme for Imbalanced Data Classification (불균형 데이터 분류를 위한 딥러닝 기반 오버샘플링 기법)

  • Son, Min Jae;Jung, Seung Won;Hwang, Een Jun
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.8 no.7
    • /
    • pp.311-316
    • /
    • 2019
  • The classification problem is to predict the class to which an input data item belongs. One of the most popular approaches is to train a machine learning algorithm on a given dataset. In this case, the dataset should have a well-balanced class distribution for the best performance. When the dataset has an imbalanced class distribution, however, classification performance can be very poor. To overcome this problem, we propose an over-sampling scheme that balances the number of samples per class by using Conditional Generative Adversarial Networks (CGAN). CGAN is a generative model derived from Generative Adversarial Networks (GAN), which can learn data characteristics and generate data similar to real data. CGAN can therefore generate data for a class that has few samples, so that the problem caused by the imbalanced class distribution is mitigated and classification performance is improved. Experiments using actually collected data show that the over-sampling technique using CGAN is effective and that it is superior to existing over-sampling techniques.

Application of Indicator Geostatistics for Probabilistic Uncertainty and Risk Analyses of Geochemical Data (지화학 자료의 확률론적 불확실성 및 위험성 분석을 위한 지시자 지구통계학의 응용)

  • Park, No-Wook
    • Journal of the Korean earth science society
    • /
    • v.31 no.4
    • /
    • pp.301-312
    • /
    • 2010
  • Geochemical data have been regarded as important environmental variables in environmental management. Since they are often sampled at sparse locations, it is important not only to predict attribute values at unsampled locations, but also to assess the uncertainty attached to the prediction for further analysis. The main objective of this paper is to exemplify how indicator geostatistics can be effectively applied to geochemical data processing to provide decision-supporting information as well as the spatial distribution of the geochemical data. A complete geostatistical analysis framework, which includes probabilistic uncertainty modeling, classification and risk analysis, is illustrated through a case study of cadmium mapping. A conditional cumulative distribution function (ccdf) was first modeled by indicator kriging, and then E-type estimates and the conditional variance were computed to obtain the spatial distribution of cadmium and quantitative uncertainty measures, respectively. Two classification criteria, probability thresholding and attribute thresholding, were applied to delineate contaminated and safe areas. Finally, additional sampling locations were extracted from the coefficient of variation, which accounts for both the conditional variance and the difference between attribute values and threshold values. It is suggested that the indicator geostatistical framework illustrated in this study is a useful tool for analyzing any environmental variable, including geochemical data, for decision-making in the presence of uncertainty.
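
The two classification criteria named in the abstract above can disagree at the same location. The sketch below illustrates this under assumed inputs. The function names, the critical probability `p_crit`, the threshold `z_crit`, and all numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def classify_by_probability(prob_exceed, p_crit=0.5):
    """Probability thresholding: flag a location as contaminated when
    Prob{Z > z_crit | data} = 1 - F(z_crit) exceeds p_crit."""
    return np.asarray(prob_exceed, dtype=float) > p_crit

def classify_by_attribute(etype, z_crit):
    """Attribute thresholding: flag a location as contaminated when
    its E-type estimate exceeds the regulatory threshold z_crit."""
    return np.asarray(etype, dtype=float) > z_crit

# Three hypothetical locations (values invented for illustration)
prob_exceed = [0.10, 0.60, 0.45]  # 1 - F(z_crit) read from each local ccdf
etype = [0.5, 1.2, 1.1]           # E-type estimates, same units as z_crit

by_prob = classify_by_probability(prob_exceed, p_crit=0.5)
by_attr = classify_by_attribute(etype, z_crit=1.0)
print(by_prob, by_attr)
```

In this made-up case the third location is flagged by the attribute criterion but not by the probability criterion, which is the kind of borderline location where additional sampling would be most informative.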

ON A CHARACTERIZATION OF THE EXPONENTIAL DISTRIBUTION BY CONDITIONAL EXPECTATIONS OF RECORD VALUES

  • Lee, Min-Young
    • Communications of the Korean Mathematical Society
    • /
    • v.16 no.2
    • /
    • pp.287-290
    • /
    • 2001
  • Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed random variables with continuous cumulative distribution function $F(x)$. $X_j$ is an upper record value of this sequence if $X_j > \max\{X_1, X_2, \ldots, X_{j-1}\}$. We define $u(n) = \min\{j \mid j > u(n-1),\ X_j > X_{u(n-1)},\ n \geq 2\}$ with $u(1) = 1$. Then $F(x) = 1 - e^{-x/c}$, $x > 0$, if and only if $E[X_{u(n+1)} - X_{u(n)} \mid X_{u(m)} = y] = c$ or $E[X_{u(n+2)} - X_{u(n)} \mid X_{u(m)} = y] = 2c$, $n \geq m+1$.
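
The "only if" direction of this characterization can be checked numerically. The following Monte Carlo sketch is not part of the paper. It draws exponential sequences with mean c, extracts upper record values, and averages the gap between the second and third records (the case n = 2, m = 1). The function name, sequence length, and trial count are arbitrary assumptions. Sequences with fewer than three records are skipped.

```python
import numpy as np

def second_record_gaps(c, n_trials=5000, length=500, seed=0):
    """Simulate X_{u(3)} - X_{u(2)} for i.i.d. exponential sequences with mean c."""
    rng = np.random.default_rng(seed)
    gaps = []
    for _ in range(n_trials):
        x = rng.exponential(scale=c, size=length)
        running_max = np.maximum.accumulate(x)
        # x[j] is an upper record if it exceeds every earlier value; x[0] always is
        is_record = np.concatenate(([True], x[1:] > running_max[:-1]))
        records = x[is_record]
        if records.size >= 3:  # need u(2) and u(3) inside the finite sequence
            gaps.append(records[2] - records[1])
    return np.array(gaps)

gaps = second_record_gaps(c=2.0)
print(gaps.mean())  # should be close to c = 2.0, up to Monte Carlo error
```

The sample mean of the gaps should land near c, which reflects the memoryless property of the exponential distribution: the excess of the next record over the current one is again exponential with mean c.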

Prediction of Soot Emissions and Particle Size distribution by KIVA3V and SWEEP in a diesel engine (KIVA3V와 SWEEP을 이용한 디젤 엔진에서의 soot 총량 및 입자 크기 분포 예측)

  • Lee, Jaeseo;Huh, Kang Y.
    • Proceedings of the Korean Society of Combustion Conference (한국연소학회:학술대회논문집)
    • /
    • 2012.11a
    • /
    • pp.129-132
    • /
    • 2012
  • Computation is performed to predict number density, volume fraction and size distribution of soot particles in typical operating conditions of a diesel engine. KIVA has been integrated with the CMC routine to consider turbulence/chemistry coupling and gas phase kinetics for heat release and soot precursors. The compositions of soot precursors are estimated by tracking Lagrangian particles to consider spatial inhomogeneity and differential diffusion in KIVA. The soot simulator SWEEP is employed as a postprocessing step to calculate conditional and integral quantities of soot particles.

Bayesian Estimation of k-Population Weibull Distribution Under Ordered Scale Parameters (순서를 갖는 척도모수들의 사전정보 하에 k-모집단 와이블분포의 베이지안 모수추정)

  • 손영숙;김성욱
    • The Korean Journal of Applied Statistics
    • /
    • v.16 no.2
    • /
    • pp.273-282
    • /
    • 2003
  • The problem of estimating the parameters of k-population Weibull distributions is discussed under a prior with ordered scale parameters. The parameters are estimated by Gibbs sampling. Since the conditional posterior distribution of the shape parameter in the Gibbs sampler is not log-concave, the shape parameter is generated by adaptive rejection sampling. Finally, we apply this estimation methodology to the data discussed in Nelson (1970).

A Bayesian Approach to Linear Calibration Design Problem

  • Kim, Sung-Chul
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.20 no.3
    • /
    • pp.105-122
    • /
    • 1995
  • Based on linear models, the inference about the true measurement x$_{f}$ and the optimal designs x (n×1) for the calibration experiments are considered via Bayesian statistical decision analysis. The posterior distribution of x$_{f}$ given the observation y$_{f}$ (q×1) and the calibration experiment is obtained with normal priors for x$_{f}$ and for the model parameters ($\alpha$, $\beta$). This posterior distribution is not in the form of any known distribution, which leads to the use of numerical integration or an approximation for the calculation of the overall expected loss. The general structure of the expected loss function is characterized in the form of a conjecture. A near-optimal design is obtained through the approximation of the conditional covariance matrix of the joint distribution of (x$_{f}$, y$_{f}$$^{T}$)$^{T}$. Numerical results for the univariate case are given to demonstrate the conjecture and to evaluate the approximation.

A fast approximate fitting for mixture of multivariate skew t-distribution via EM algorithm

  • Kim, Seung-Gu
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.2
    • /
    • pp.255-268
    • /
    • 2020
  • Mixtures of multivariate canonical fundamental skew t-distributions (CFUST) have been of interest in various fields, and interest in the unsupervised learning community is particularly noteworthy. However, fitting the model via the EM algorithm requires significant processing time. The main cause is the calculation of many multivariate t-cdfs (cumulative distribution functions) in the E-step. In this article, we provide an approximate but fast method for calculating them in a univariate fashion, as the product of successively conditional univariate t-cdfs with a first-order Taylor approximation. By replacing all multivariate t-cdfs in the E-step with the proposed approximate versions, we obtain admissible model-fitting results, with an 85% reduction in processing time for the 5-dimensional skewness case of the Australian Institute of Sport data set. Rough properties, advantages and limits of this approach are also discussed.

A new extension of Lindley distribution: modified validation test, characterizations and different methods of estimation

  • Ibrahim, Mohamed;Yadav, Abhimanyu Singh;Yousof, Haitham M.;Goual, Hafida;Hamedani, G.G.
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.5
    • /
    • pp.473-495
    • /
    • 2019
  • In this paper, a new extension of the Lindley distribution is introduced. Certain characterizations of the proposed distribution based on truncated moments, the hazard and reverse hazard functions, and conditional expectation are presented. Besides these characterizations, other statistical and mathematical properties of the proposed model are also discussed. The parameters are estimated through different classical methods of estimation. Bayes estimation is computed with an informative gamma prior under the squared error loss function. The performance of all estimation methods is studied via Monte Carlo simulations in the mean square error sense. The potential of the proposed model is analyzed through two data sets. A modified goodness-of-fit test using the Nikulin-Rao-Robson statistic is investigated via two examples, and it is observed that the new extension might be used as an alternative lifetime model.

Volatility of Export Volume and Export Value of Gwangyang Port (광양항의 수출물동량과 수출액의 변동성)

  • Mo, Soo-Won;Lee, Kwang-Bae
    • Journal of Korea Port Economic Association
    • /
    • v.31 no.1
    • /
    • pp.1-14
    • /
    • 2015
  • The standard GARCH model, which imposes symmetry on the conditional variance, tends to fail in capturing some important features of the data. This paper therefore introduces models that capture asymmetric effects: the EGARCH model and the GJR model. We provide a systematic comparison of volatility models focusing on the asymmetric effect of news on volatility. Specifically, three diagnostic tests are provided: the sign bias test, the negative size bias test, and the positive size bias test. This paper shows that there is significant evidence of a GARCH-type process in the data, as shown by the Ljung-Box Q statistic test on the squared residuals. The estimated unconditional density function of the squared residuals is clearly skewed to the left and markedly leptokurtic when compared with the standard normal distribution. The observation of volatility clustering is also clearly reinforced by the plot of the squared residuals of export volume and export value. The unconditional variance of both export volume and export value indicates that large shocks of either sign tend to be followed by large shocks, and small shocks of either sign tend to follow small shocks. The estimated export volume news impact curve for the GARCH model also suggests that $h_t$ is overestimated for large negative and positive shocks. The conditional variance equation of the GARCH model for export volume contains two parameters, ${\alpha}$ and ${\beta}$, that are insignificant, indicating that the GARCH model is a poor characterization of the conditional variance of export volume. The conditional variance equation of the EGARCH model for export value, however, shows a positive sign for the parameter ${\delta}$, which is contrary to our expectation, while in the GJR model the parameters ${\alpha}$ and ${\beta}$ are insignificant and ${\delta}$ is marginally significant. This indicates that the asymmetric volatility models are a poor characterization of the conditional variance of export value. It is concluded that the asymmetric EGARCH and GJR models are appropriate for explaining the volatility of export volume, while the symmetric standard GARCH model is good for capturing the volatility of export value.
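
The contrast between symmetric and asymmetric models can be illustrated with news impact curves. The sketch below assumes the textbook forms h_t = ω + α ε²_{t-1} + β h_{t-1} for GARCH(1,1) and h_t = ω + (α + δ·1[ε_{t-1} < 0]) ε²_{t-1} + β h_{t-1} for GJR. The paper's exact parameterization is not given in the abstract, and the parameter values here are invented for illustration, not estimates from the Gwangyang Port data.

```python
import numpy as np

def garch_nic(eps, omega, alpha, beta, h_bar):
    """News impact curve of GARCH(1,1): next-period variance as a function of
    the shock eps, holding the lagged variance fixed at h_bar."""
    eps = np.asarray(eps, dtype=float)
    return omega + alpha * eps**2 + beta * h_bar

def gjr_nic(eps, omega, alpha, delta, beta, h_bar):
    """News impact curve of GJR: negative shocks receive the extra weight delta."""
    eps = np.asarray(eps, dtype=float)
    return omega + (alpha + delta * (eps < 0)) * eps**2 + beta * h_bar

# Hypothetical parameters (assumptions, not taken from the paper)
shocks = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(garch_nic(shocks, omega=0.1, alpha=0.1, beta=0.8, h_bar=1.0))
print(gjr_nic(shocks, omega=0.1, alpha=0.1, delta=0.2, beta=0.8, h_bar=1.0))
```

The GARCH curve is symmetric around zero, while the GJR curve with positive δ rises faster for negative shocks. The sign bias and size bias tests mentioned in the abstract are designed to detect this kind of asymmetry.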