• Title/Summary/Keyword: mixture of extreme distributions

Search Result 10

Extreme Values of Mixed Erlang Random Variables (혼합 얼랑 확률변수의 극한치)

  • Kang, Sung-Yeol
    • Journal of the Korean Operations Research and Management Science Society, v.28 no.4, pp.145-153, 2003
  • In this paper, we examine the limiting distributional behaviour of extreme values of mixed Erlang random variables. We show that, in a finite mixture of Erlang distributions, the component distribution with the asymptotically dominant tail has a critical effect on the asymptotic extreme behaviour of the mixture distribution, and that the normalized maximum of the mixture converges to the Gumbel extreme-value distribution. Normalizing constants are also established. We apply this result to characterize the asymptotic distribution of maxima of sojourn times in an M/M/s queueing system. We also show that Erlang mixtures with continuous mixing may converge to the Gumbel or the Type II extreme-value distribution depending on their mixing distributions, considering two special cases: uniform mixing and exponential mixing.
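The tail-dominance claim can be checked numerically. The sketch below is not the paper's derivation: the component weights, shapes, and rates are hypothetical, and `gumbel_norming` uses the standard gamma-type normalizing constants, which are assumed here and may differ in form from the constants established in the paper.

```python
import math

def erlang_sf(x, shape, rate):
    """Survival function P(X > x) of an Erlang(shape, rate) variable (shape a positive integer)."""
    if x <= 0:
        return 1.0
    lx = rate * x
    term, total = 1.0, 1.0
    for j in range(1, shape):
        term *= lx / j          # builds (rate*x)^j / j! incrementally
        total += term
    return math.exp(-lx) * total

def mixture_sf(x, components):
    """Survival function of a finite Erlang mixture; components = [(weight, shape, rate), ...]."""
    return sum(w * erlang_sf(x, k, lam) for w, k, lam in components)

def gumbel_norming(n, shape, rate, weight=1.0):
    """Assumed gamma-type constants (a_n, b_n) with n * weight * S(b_n + a_n * x) -> exp(-x).

    Requires n * weight > 1 so that log(log(.)) is defined.
    """
    n_eff = n * weight
    a_n = 1.0 / rate
    b_n = (math.log(n_eff) + (shape - 1) * math.log(math.log(n_eff))
           - math.lgamma(shape)) / rate
    return a_n, b_n

# Hypothetical mixture: the rate-1.0 component has the heavier (dominant) tail.
components = [(0.3, 2, 1.0), (0.7, 3, 2.0)]
tail_ratio = mixture_sf(50.0, components) / erlang_sf(50.0, 2, 1.0)
print(tail_ratio)  # approaches the dominant component's weight far in the tail
```

Far in the tail the mixture behaves like its dominant component scaled by that component's weight, which is why only that component determines the limit law. Convergence to the Gumbel limit is slow for shapes above 1, so the constants should be read as asymptotic.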

A Study of Outlier Detection Using the Mixture of Extreme Distributions Based on Deep-Sea Fishery Data (원양어선 조업 데이터의 혼합 극단분포를 이용한 이상점 탐색 연구)

  • Lee, Jung Jin;Kim, Jae Kyoung
    • The Korean Journal of Applied Statistics, v.28 no.5, pp.847-858, 2015
  • Deep-sea fishing in the Antarctic Ocean has been actively pursued by developed countries, including Korea. To prevent environmental destruction of the Antarctic Ocean, the countries concerned have established the Commission for the Conservation of Antarctic Marine Living Resources (CCAMLR) and monitor illegal, unreported, or unregulated fishing. Fishing for toothfish, an expensive fish, in the Antarctic Ocean has increased recently, and high catches per unit effort (CPUE) by fishing boats, which raise suspicion of illegal activity, have been reported frequently. CPUE data in a fishing area of the Antarctic Ocean often follow an extreme distribution or a mixture of two extreme distributions. This paper proposes an algorithm that detects outliers in CPUE data using a mixture of two extreme distributions. The parameters of the mixture distribution are estimated by the EM algorithm, and the log-likelihood value and posterior probabilities are used to detect outliers. Experiments show that the proposed algorithm can replace simple criteria such as flagging any CPUE greater than 1.
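The posterior probabilities mentioned above come from the E-step of EM. The following is a minimal sketch of that step only, assuming both components are Gumbel (maximum) distributions; the abstract does not name the extreme distribution, and the parameter values are hypothetical rather than fitted. The M-step, which needs numerical optimization for Gumbel components, is omitted.

```python
import math

def gumbel_pdf(x, mu, beta):
    """Density of the Gumbel (maximum) distribution with location mu and scale beta."""
    z = (x - mu) / beta
    return math.exp(-(z + math.exp(-z))) / beta

def responsibilities(x, components):
    """E-step for one observation.

    components: list of (weight, mu, beta) with weights summing to 1.
    Returns (posterior probability of each component, log density of x under the mixture).
    """
    joint = [w * gumbel_pdf(x, mu, beta) for w, mu, beta in components]
    total = sum(joint)
    return [j / total for j in joint], math.log(total)

# Hypothetical parameters: a "typical" CPUE component and a "high" CPUE component.
components = [(0.8, 0.3, 0.1), (0.2, 1.0, 0.3)]
post_typical, loglik_typical = responsibilities(0.3, components)
post_high, loglik_high = responsibilities(1.5, components)
print(post_typical, post_high)
```

An observation with a high posterior probability for the upper component, or an unusually low log density under the whole mixture, would be a candidate outlier; the exact decision rule used in the paper is not given in the abstract.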

Extreme value modeling of structural load effects with non-identical distribution using clustering

  • Zhou, Junyong;Ruan, Xin;Shi, Xuefei;Pan, Chudong
    • Structural Engineering and Mechanics, v.74 no.1, pp.55-67, 2020
  • The common practice for predicting characteristic structural load effects (LEs) over long reference periods is to employ extreme value theory (EVT) to build limit distributions. However, most applications ignore that LEs are driven by multiple loading events and thus are not identically distributed, which is a prerequisite for EVT. In this study, we propose a composite extreme value modeling approach using clustering to (a) cluster the initial blended samples into a finite number of identically distributed subsamples using the finite mixture model, the expectation-maximization algorithm, and the Akaike information criterion; and (b) combine the limit distributions of the subsamples into a composite prediction equation using the generalized Pareto distribution based on a joint threshold. The proposed approach was validated through numerical examples with known solutions and through engineering applications to bridge traffic LEs on a long-span bridge. The results indicate that a joint threshold largely benefits the composite extreme value modeling, that many appropriate tail approaching models can be used, and that the equation form is simply the sum of the weighted models. In the numerical examples, the proposed approach using clustering produced accurate extrema predictions for any reference period compared with the known solutions, whereas the common practice of applying EVT without clustering to the mixture data showed large deviations. Real-world bridge traffic LEs are driven by multiple events and present multipeak distributions, and the proposed approach captures the tail behavior of LEs better than the conventional approach. The proposed approach is expected to apply widely to general problems in which samples are driven by multiple events and are not identically distributed.
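The "sum of the weighted models" form can be sketched as follows. This is an illustration under stated assumptions, not the authors' equation: each cluster contributes a generalized Pareto tail above a shared threshold `u`, events are treated as independent, and all cluster parameters are hypothetical.

```python
def gpd_sf(y, sigma, xi):
    """GPD survival function for an excess y >= 0 over the threshold."""
    if abs(xi) < 1e-12:
        import math
        return math.exp(-y / sigma)
    base = 1.0 + xi * y / sigma
    return base ** (-1.0 / xi) if base > 0 else 0.0  # 0 beyond a finite upper endpoint

def composite_exceedance(x, u, clusters):
    """P(X > x) for x >= u as a weighted sum of per-cluster GPD tails.

    clusters: list of (weight, zeta, sigma, xi), where zeta = P(X > u | cluster).
    """
    return sum(w * zeta * gpd_sf(x - u, s, xi) for w, zeta, s, xi in clusters)

def max_cdf(x, u, clusters, n_events):
    """CDF of the maximum of n_events independent load effects, valid for x >= u."""
    return (1.0 - composite_exceedance(x, u, clusters)) ** n_events

def return_level(p, u, clusters, n_events, hi=1e6):
    """Bisection for the level x >= u whose maximum-CDF equals p."""
    lo = u
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if max_cdf(mid, u, clusters, n_events) < p:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical clusters sharing the joint threshold u = 100.
u = 100.0
clusters = [(0.7, 0.05, 10.0, 0.0), (0.3, 0.20, 25.0, -0.1)]
x95 = return_level(0.95, u, clusters, n_events=10_000)
print(x95)
```

Raising the one-event distribution to the power `n_events` is what extends the fitted tail to a longer reference period.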

A development of nonstationary rainfall frequency analysis model based on mixture distribution (혼합분포 기반 비정상성 강우 빈도해석 기법 개발)

  • Choi, Hong-Geun;Kwon, Hyun-Han;Park, Moon-Hyung
    • Journal of Korea Water Resources Association, v.52 no.11, pp.895-904, 2019
  • It is well recognized that extreme rainfall processes often exhibit nonstationary behavior, which may not be modeled effectively within a stationary frequency analysis framework. Moreover, extreme rainfall events are often described by a two (or more)-component mixture distribution, which can be attributed to the distinct rainfall patterns associated with summer monsoons and tropical cyclones. From this perspective, this study explores a Mixture Distribution based Nonstationary Frequency (MDNF) model for changing rainfall patterns within a Bayesian framework. The MDNF model can account for time-varying moments (e.g., the location parameter) of the Gumbel distribution in a two (or more)-component mixture distribution. The performance of the MDNF model was evaluated with various statistical measures and compared with frequency models based on stationary and nonstationary mixture distributions. The comparison showed that the MDNF model substantially improved overall performance, supporting the assumption that extreme rainfall patterns may be distinctly nonstationary.
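To make the idea of a time-varying location parameter concrete, here is a small sketch of a Gumbel mixture whose component locations drift linearly in time. The linear trend, the parameter values, and the bisection solver are assumptions for illustration; the paper's Bayesian estimation is not reproduced.

```python
import math

def gumbel_cdf(x, mu, beta):
    """Gumbel (maximum) distribution function, guarded against overflow far in the lower tail."""
    z = (x - mu) / beta
    if z < -50.0:
        return 0.0
    return math.exp(-math.exp(-z))

def mixture_cdf(x, t, comps):
    """comps: list of (weight, mu0, mu1, beta); location at time t is mu0 + mu1 * t."""
    return sum(w * gumbel_cdf(x, mu0 + mu1 * t, beta) for w, mu0, mu1, beta in comps)

def return_level(T, t, comps):
    """T-year return level at time t: solves mixture_cdf(x, t) = 1 - 1/T by bisection."""
    p = 1.0 - 1.0 / T
    locs = [mu0 + mu1 * t for _, mu0, mu1, _ in comps]
    width = 50.0 * max(beta for *_, beta in comps)
    lo, hi = min(locs) - width, max(locs) + width
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, t, comps) < p:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical two-component model (e.g. monsoon and typhoon rainfall), in mm.
comps = [(0.6, 100.0, 0.5, 20.0), (0.4, 180.0, 1.0, 35.0)]
rl_start = return_level(100, 0, comps)
rl_later = return_level(100, 30, comps)
print(rl_start, rl_later)
```

With positive trends in both locations, the 100-year level at `t = 30` exceeds the level at `t = 0`, which is the effect a stationary model cannot represent.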

Count Five Statistics Using Trimmed Mean

  • Hong, Chong-Sun;Jun, Jae-Woon
    • Communications for Statistical Applications and Methods, v.13 no.2, pp.309-318, 2006
  • There are many statistical methods for testing the equality of two population variances. Among them, the well-known F test is very sensitive to the normality assumption. Several other tests that do not assume normality have been proposed, but these tests usually need tables of critical values or software for hypothesis testing. McGrath and Yeh (2005) suggested a quick and compact Count Five test that requires only counting the number of extreme points. Because the Count Five test uses only extreme values, it discards some information from the samples, which often reduces power. In this paper, an alternative Count Five test using the trimmed mean is proposed, and its properties are discussed for some distributions and normal mixtures.
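The counting step can be sketched as below. This follows the Count Five procedure as it is usually described (center each sample, then count the absolute deviations in one sample that exceed the largest absolute deviation in the other, rejecting equal variances when either count reaches five). Using the trimmed mean as the centering value is an assumption about the proposed variant, since the abstract gives no details, and the data are made up.

```python
def trimmed_mean(values, trim=0.1):
    """Mean after dropping the lowest and highest `trim` fraction of the sorted values."""
    s = sorted(values)
    k = int(len(s) * trim)
    core = s[k:len(s) - k] if k > 0 else s
    return sum(core) / len(core)

def mean(values):
    return sum(values) / len(values)

def count_five(x, y, center=mean, threshold=5):
    """Return (count_x, count_y, reject) for two samples of equal size."""
    cx, cy = center(x), center(y)
    dev_x = [abs(v - cx) for v in x]
    dev_y = [abs(v - cy) for v in y]
    count_x = sum(d > max(dev_y) for d in dev_x)  # x-points more extreme than every y-point
    count_y = sum(d > max(dev_x) for d in dev_y)  # y-points more extreme than every x-point
    return count_x, count_y, max(count_x, count_y) >= threshold

# Made-up samples: x is clearly more dispersed than y.
x = [-10, -8, -6, -4, 4, 6, 8, 10, 0, 0]
y = [-1, -0.5, 0, 0.5, 1, -1, 1, 0, 0.5, -0.5]
result_mean = count_five(x, y)
result_trimmed = count_five(x, y, center=trimmed_mean)
print(result_mean, result_trimmed)
```

Passing the centering function as a parameter keeps the counting logic identical for both versions, so only the choice of center differs.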

Prediction of sharp change of particulate matter in Seoul via quantile mapping

  • Jeongeun Lee;Seoncheol Park
    • Communications for Statistical Applications and Methods, v.30 no.3, pp.259-272, 2023
  • In this paper, we suggest a new method for predicting sharp changes in particulate matter (PM10) using quantile mapping. To predict the current PM10 density in Seoul, we consider PM10 and precipitation observed a few hours earlier at the Baengnyeong and Ganghwa monitoring stations. To estimate the PM10 distribution, we use the extreme value mixture model, which combines a conventional probability distribution with the generalized Pareto distribution. Furthermore, we consider a quantile generalized additive model (QGAM) to model the relationship between precipitation and PM10. To assess the validity of the proposed model, we conducted a simulation study, which showed that the proposed method gives lower mean absolute differences. Real data analysis shows that the proposed method can give more accurate predictions when there are sharp changes in PM10 in Seoul.
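An extreme value mixture model of this kind can be written as a bulk distribution below a threshold spliced to a generalized Pareto tail above it. The sketch below uses one common parameterisation with an explicit tail fraction `phi`; the normal bulk and every parameter value are assumptions for illustration, not the fitted model from the paper.

```python
import math

def normal_cdf(x, mean, sd):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def gpd_cdf(y, sigma, xi):
    """GPD distribution function for an excess y >= 0 over the threshold."""
    if abs(xi) < 1e-12:
        return 1.0 - math.exp(-y / sigma)
    base = 1.0 + xi * y / sigma
    return 1.0 - base ** (-1.0 / xi) if base > 0 else 1.0

def evmix_cdf(x, mean, sd, u, sigma, xi, phi):
    """Normal bulk at or below threshold u, GPD tail above it; phi = P(X > u)."""
    if x <= u:
        # Bulk is rescaled so that exactly 1 - phi of the mass lies at or below u.
        return (1.0 - phi) * normal_cdf(x, mean, sd) / normal_cdf(u, mean, sd)
    return (1.0 - phi) + phi * gpd_cdf(x - u, sigma, xi)

# Hypothetical PM10-like parameters (micrograms per cubic metre).
params = dict(mean=40.0, sd=15.0, u=80.0, sigma=20.0, xi=0.2, phi=0.05)
for level in (60.0, 80.0, 120.0):
    print(level, evmix_cdf(level, **params))
```

Quantile mapping would then pass an observed value through one such distribution function and invert another, so the quality of the tail fit directly affects predictions of sharp increases.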

Regional Analysis of Particulate Matter Concentration Risk in South Korea (국내 지역별 미세먼지 농도 리스크 분석)

  • Oh, Jang Wook;Lim, Tea Jin
    • Journal of the Korean Society of Safety, v.32 no.5, pp.157-167, 2017
  • Millions of people die every year from diseases caused by exposure to outdoor air pollution. One of the most severe types of air pollution is fine particulate matter (PM10, PM2.5), from which South Korea has also suffered severely. This paper analyzes the regional risks induced by PM10 and PM2.5 in Korea from 2014 through the third quarter of 2016. We investigated daily maxima of PM10 and PM2.5 observed at 284 stations in South Korea and found extremely high outliers. We fitted extreme value distributions to the PM10 and PM2.5 data, but a single distribution did not fit the data well. For these reasons, we implemented extreme mixture models that combine the generalized Pareto distribution (GPD) with the normal, gamma, Weibull, and log-normal distributions, respectively. Next, we divided the whole area into 16 regions and analyzed the characteristics of PM risk by developing FN-curves. Finally, we estimated return levels for periods of 1 month, 1 quarter, half a year, 1 year, and 3 years. The severity rankings of PM10 and PM2.5 concentrations differed from region to region. The capital area showed the worst PM risk in all seasons. The high PM risk even in the yellow-dust-free season (June to September) can be attributed to the concentration of factories in this area. Gwangju showed the highest return level of PM2.5, even though its return level of PM10 was relatively low. This implies that the chemical mechanisms producing PM2.5 in the vicinity of Gwangju should be investigated. On the other hand, Gyeongbuk and Ulsan showed relatively high PM10 risk and low PM2.5 risk. This indicates that the management policy for PM risk in the west should differ from that in the east. The results of this research may provide insights for managing regional risks induced by PM10 and PM2.5 in South Korea.
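Return levels for the periods listed above can be computed from the GPD tail in closed form. The sketch uses the standard formula for the level exceeded on average once every m observations; the threshold, GPD parameters, exceedance probability, and the conversion of periods to numbers of daily observations are all hypothetical.

```python
import math

def gpd_return_level(m, u, sigma, xi, zeta_u):
    """Level exceeded on average once every m observations.

    u: threshold; sigma, xi: GPD scale and shape; zeta_u: P(X > u).
    Only valid when m * zeta_u >= 1, i.e. when the level lies at or above u.
    """
    r = m * zeta_u
    if r < 1.0:
        raise ValueError("return period too short: level would fall below the threshold")
    if abs(xi) < 1e-12:
        return u + sigma * math.log(r)
    return u + sigma / xi * (r ** xi - 1.0)

# Assumed mapping from reporting periods to numbers of daily maxima.
periods = {"1 month": 30, "1 quarter": 91, "half year": 182, "1 year": 365, "3 years": 1095}
# Hypothetical tail parameters for one region (micrograms per cubic metre).
levels = {name: gpd_return_level(m, u=150.0, sigma=40.0, xi=0.15, zeta_u=0.05)
          for name, m in periods.items()}
for name, level in levels.items():
    print(name, round(level, 1))
```

Comparing such levels across regions for PM10 and PM2.5 separately is one way to obtain region-specific severity rankings of the kind the abstract reports.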

Density estimation of summer extreme temperature over South Korea using mixtures of conditional autoregressive species sampling model (혼합 조건부 종추출모형을 이용한 여름철 한국지역 극한기온의 위치별 밀도함수 추정)

  • Jo, Seongil;Lee, Jaeyong
    • Journal of the Korean Data and Information Science Society, v.27 no.5, pp.1155-1168, 2016
  • This paper considers the problem of estimating probability densities of climate values. In particular, we focus on estimating probability densities of summer extreme temperatures over South Korea. It is known that the probability density of climate values at one location is similar to those at nearby locations and does not follow well-known parametric distributions. To accommodate these properties, we use a mixture of conditional autoregressive species sampling model, which is a nonparametric Bayesian model with spatial dependency. We apply the model to a dataset of summer maximum and minimum temperatures over South Korea. The dataset was obtained from the University of East Anglia.

Performance Analysis of Economic VaR Estimation using Risk Neutral Probability Distributions

  • Heo, Se-Jeong;Yeo, Sung-Chil;Kang, Tae-Hun
    • The Korean Journal of Applied Statistics, v.25 no.5, pp.757-773, 2012
  • Traditional value at risk (S-VaR) has difficulty predicting the future risk of financial asset prices because S-VaR is a backward-looking measure based on historical data of the underlying asset prices. To address this deficiency, an economic value at risk (E-VaR) using risk-neutral probability distributions has been suggested, since E-VaR is a forward-looking measure based on option price data. In this study, E-VaR is estimated by assuming the generalized gamma distribution (GGD) as the risk-neutral density function implied in option prices. The estimated E-VaR with GGD was compared with E-VaR estimates under the Black-Scholes model, the two-lognormal mixture distribution, and the generalized extreme value distribution, and with S-VaR estimates under the normal distribution and the GARCH(1, 1) model. Option market data for the KOSPI 200 index are used to compare the performance of these VaR estimates. The empirical results show that GGD tends to estimate VaR conservatively; however, GGD is superior to the other models overall.

A Bayesian Approach to Gumbel Mixture Distribution for the Estimation of Parameter and its use to the Rainfall Frequency Analysis (Bayesian 기법을 이용한 혼합 Gumbel 분포 매개변수 추정 및 강우빈도해석 기법 개발)

  • Choi, Hong-Geun;Uranchimeg, Sumiya;Kim, Yong-Tak;Kwon, Hyun-Han
    • KSCE Journal of Civil and Environmental Engineering Research, v.38 no.2, pp.249-259, 2018
  • More than half of the annual rainfall in Korea occurs in the summer season owing to its climate and geographical location. Frequency analysis is commonly adopted for designing hydraulic structures under such concentrated rainfall conditions. Among the various distributions, the univariate Gumbel distribution has been routinely used for rainfall frequency analysis in Korea. However, distributional changes in extreme rainfall have been observed globally, including in Korea. More specifically, rainfall frequency analysis based on the univariate Gumbel distribution often fails to describe multimodal behavior, which is mainly influenced by distinct climate conditions during the wet season. In this context, we propose a Gumbel mixture distribution based rainfall frequency analysis within a Bayesian framework, and compare the results with those of the univariate Gumbel distribution. The proposed model described the underlying distributions better, leading to lower Bayesian information criterion (BIC) values. The mixed Gumbel distribution was more robust than the single Gumbel distribution in describing the upper tail of the distribution, which plays a crucial role in reliably estimating design rainfall and the uncertainty caused by peaks in the upper tail. Therefore, it can be concluded that the mixed Gumbel distribution is better suited to extreme frequency analysis of rainfall data with two or more peaks in its distribution.
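The BIC comparison rests on two quantities: the log-likelihood of each candidate model and its number of free parameters. The helper functions below are a sketch under stated assumptions; the data values are made up, the parameters are hand-picked rather than estimated, and the Bayesian fitting used in the paper is not shown.

```python
import math

def gumbel_logpdf(x, mu, beta):
    """Log density of the Gumbel (maximum) distribution."""
    z = (x - mu) / beta
    return -math.log(beta) - z - math.exp(-z)

def mixture_loglik(data, comps):
    """Log-likelihood of a Gumbel mixture; comps = [(weight, mu, beta), ...]."""
    return sum(
        math.log(sum(w * math.exp(gumbel_logpdf(x, mu, beta)) for w, mu, beta in comps))
        for x in data
    )

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; lower values indicate the preferred model."""
    return n_params * math.log(n_obs) - 2.0 * loglik

# Made-up annual maximum rainfall values (mm) with two apparent clusters.
data = [55.0, 60.2, 71.5, 140.3, 152.8]
single = [(1.0, 80.0, 40.0)]                       # 2 free parameters
mixed = [(0.6, 60.0, 8.0), (0.4, 145.0, 8.0)]      # 5 free parameters (one weight is free)
bic_single = bic(mixture_loglik(data, single), 2, len(data))
bic_mixed = bic(mixture_loglik(data, mixed), 5, len(data))
print(bic_single, bic_mixed)
```

Because the parameters here are not fitted, the printed values only show how the criterion is assembled; a real comparison would plug in the maximized or posterior-based log-likelihoods.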