• Title/Abstract/Keywords: maximum entropy models

Search results: 45 items (processing time 0.028 s)

Maximum entropy test for infinite order autoregressive models

  • Lee, Sangyeol;Lee, Jiyeon;Noh, Jungsik
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 24, No. 3
    • /
    • pp.637-642
    • /
    • 2013
  • In this paper, we consider the maximum entropy test in infinite order autoregressive models. Its asymptotic distribution is derived under the null hypothesis. A bootstrap version of the test is discussed and its performance is evaluated through Monte Carlo simulations.

Discriminant Analysis of Binary Data by Using the Maximum Entropy Distribution

  • Lee, Jung Jin;Hwang, Joon
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 10, No. 3
    • /
    • pp.909-917
    • /
    • 2003
  • Although many classification models have been used to classify binary data, none of them dominates in all circumstances, which vary with the number of variables and the size of the data (Asparoukhov and Krzanowski (2001)). This paper proposes a classification model that uses information on the marginal distributions of sub-variables and their maximum entropy distribution. Classification experiments using simulation are discussed.

Resolving Prepositional Phrase Attachment and POS Tagging Ambiguities using a Maximum Entropy Boosting Model

  • 박성배
    • 한국정보과학회논문지:소프트웨어및응용
    • /
    • Vol. 30, No. 5-6
    • /
    • pp.570-578
    • /
    • 2003
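  • Maximum entropy models are a good method for modeling natural language. However, applying them to real language problems such as prepositional phrase attachment raises two issues: feature selection and computational complexity. In this paper, we present a maximum entropy boosting model that addresses these issues, along with the imbalanced-data problem found in natural language resources, and apply it to prepositional phrase attachment and part-of-speech (POS) tagging disambiguation in English. In experiments on the Wall Street Journal corpus, despite very little effort spent on modeling the problems, the model achieved 84.3% accuracy on prepositional phrase attachment and 96.78% accuracy on POS tagging, results comparable to the best performance known to date.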

Discriminant Analysis of Binary Data with Multinomial Distribution by Using the Iterative Cross Entropy Minimization Estimation

  • Lee Jung Jin
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 12, No. 1
    • /
    • pp.125-137
    • /
    • 2005
  • Many discriminant analysis models for binary data have been used in real applications, but none of them dominates in all circumstances (Asparoukhov & Krzanowski (2001)). Lee and Hwang (2003) proposed a new classification model that uses the multinomial distribution with the maximum entropy estimation method. The model showed some promising results with a small number of variables, but its performance was not satisfactory with a large number of variables. This paper explores using the iterative cross entropy minimization estimation method in place of maximum entropy estimation. Simulation experiments show that this method can compete with other well-known existing classification models.

A Comparative Analysis of Surplus Production Models and a Maximum Entropy Model for Estimating the Anchovy's Stock in Korea

  • 표희동
    • 수산해양교육연구
    • /
    • Vol. 18, No. 1
    • /
    • pp.19-30
    • /
    • 2006
  • Surplus production (SP) models and a maximum entropy (ME) model are employed in this paper for fishery stock assessment and estimation of the optimum sustainable yield of anchovy in Korea. To determine appropriate models, five traditional SP models (the Schaefer model, the Schnute model, the Walters and Hilborn model, the Fox model, and the Clarke, Yoshimoto and Pooley (CYP) model) are tested on effort and catch data for anchovy, which accounts for 7% of the total fisheries landings of Korea. Of the five SP models, only the CYP model is statistically significant at the 10% level. Estimated intrinsic growth rates are similar in the CYP and ME models, while the environmental carrying capacity of the ME model is considerably greater than that of the CYP model. In addition, the estimated maximum sustainable yield (MSY) of 213,287 tons in the ME model is slightly higher than that of the CYP model (198,364 tons). Biomass at MSY in the ME model, however, is calculated at 651,000 tons, considerably greater than that of the CYP model (322,881 tons). The comparison is meaningful in that it highlights differences in stock assessment between the two models and their potential strengths and weaknesses.

A study on the estimation of potential yield for Korean west coast fisheries using the holistic production method (HPM)

  • 김현아;서영일;차형기;강희중;장창익
    • 수산해양기술연구
    • /
    • Vol. 54, No. 1
    • /
    • pp.38-53
    • /
    • 2018
  • The purpose of this study is to estimate potential yield (PY) for Korean west coast fisheries using the holistic production method (HPM). HPM involves the use of surplus production models to apply input data of catch and standardized fishing efforts. HPM compared the estimated parameters of the surplus production from four different models: the Fox model, CYP model, ASPIC model, and maximum entropy model. The PY estimates ranged from 174,232 metric tons (mt) using the CYP model to 238,088 mt using the maximum entropy model. The highest coefficient of determination ($R^2$), the lowest root mean square error (RMSE), and the lowest Theil's U statistic (U) for Korean west coast fisheries were obtained from the maximum entropy model. The maximum entropy model showed relatively better fits of data, indicating that the maximum entropy model is statistically more stable and accurate than other models. The estimate from the maximum entropy model is regarded as a more reasonable estimate of PY. The quality of input data should be improved for the future study of PY to obtain more reliable estimates.

Dual Generalized Maximum Entropy Estimation for Panel Data Regression Models

  • Lee, Jaejun;Cheon, Sooyoung
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 21, No. 5
    • /
    • pp.395-409
    • /
    • 2014
  • Problems in which the data are limited, partial, or incomplete are known as ill-posed problems. If data with ill-posed problems are analyzed by traditional statistical methods, the results are not reliable and lead to erroneous interpretations. To overcome these problems, we propose a dual generalized maximum entropy (dual GME) estimator for panel data regression models based on an unconstrained dual Lagrange multiplier method. Monte Carlo simulations for panel data regression models with exogeneity, endogeneity, and/or collinearity show that the dual GME estimator outperforms several other estimators, such as those using least squares and instruments, even in small samples. We believe that our dual GME procedure developed for the panel data regression framework will be useful for analyzing ill-posed and endogenous data sets.

Comparison of models for estimating surplus productions and methods for estimating their parameters

  • 권유정;장창익;표희동;서영일
    • 수산해양기술연구
    • /
    • Vol. 49, No. 1
    • /
    • pp.18-28
    • /
    • 2013
  • We compared the parameters estimated by three kinds of surplus production models: three traditional surplus production models (Schaefer, Gulland, and Schnute), a stock production model incorporating covariates (ASPIC), and a maximum entropy (ME) model. We also evaluated the performance of the models in estimating their parameters. The maximum sustainable yield (MSY) of small yellow croaker (Pseudosciaena polyactis) in Korean waters ranged from 35,061 metric tons (mt) by the Gulland model to 44,844 mt by the ME model, and fishing effort at MSY ($f_{MSY}$) ranged from 262,188 hauls by the Schnute model to 355,200 hauls by the ME model. The lowest root mean square error (RMSE) for small yellow croaker was obtained from the Gulland surplus production model, while the highest RMSE was from the Schnute model. However, the highest coefficient of determination ($R^2$) was from the ME model, and the ASPIC model yielded the lowest. On the other hand, the MSY of Kapenta (Limnothrissa miodon) ranged from 16,880 mt by the ASPIC model to 25,373 mt by the ME model, and $f_{MSY}$ from 94,580 hauls by the ASPIC model to 225,490 hauls by the Schnute model. In this case, both the lowest RMSE and the highest $R^2$ were obtained from the ME model, which fitted the data relatively better, indicating that the ME model is statistically more stable and robust than the other models. Moreover, the ME model could provide additional ecologically useful parameters such as biomass at MSY ($B_{MSY}$), carrying capacity of the population (K), catchability coefficient (q), and the intrinsic rate of population growth (r).

A Comparative Analysis of Maximum Entropy and Analytical Models for Assessing Kapenta (Limnothrissa miodon) Stock in Lake Kariba

  • 이타이 텐다우펜유;표희동
    • 자원ㆍ환경경제연구
    • /
    • Vol. 26, No. 4
    • /
    • pp.613-639
    • /
    • 2017
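  • A maximum entropy (ME) model and an analytical model are applied to estimate the Kapenta stock in Lake Kariba. Using the ME model, a maximum sustainable yield (MSY) of 25,372 tons and a fishing effort at MSY of 109,731 fishing nights were estimated, which indicates that overinvestment in the current level of fishing effort has been a factor reducing the stock from 1988 through 2009. The analytical model estimates the annual acceptable biological catch (ABC) and an annual fishing mortality coefficient of 1.21 (greater than the general fishing mortality coefficient of 0.927). The two models estimate similar stock sizes, which can be applied to estimating the stock for the 1982 base year. The ME model estimates that the stock gradually declined to less than one third of its 1988 peak (156,047 tons), which implies overfishing, since recent catches are below the MSY level but above the ABC level. In other words, the analytical model provides a more conservative ABC than the MSY of the ME model, implying that conservative fisheries management policies (such as a total allowable catch system and fishing effort reduction policies) should be actively considered.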

Application of Generalized Maximum Entropy Estimator to the Two-way Nested Error Component Model with Ill-Posed Data

  • Cheon, Soo-Young
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 16, No. 4
    • /
    • pp.659-667
    • /
    • 2009
  • Recently, Song and Cheon (2006) and Cheon and Lim (2009) developed the generalized maximum entropy (GME) estimator to solve ill-posed problems for the regression coefficients in the simple panel model. The models they discuss consider individual effects and spatial autoregressive disturbance effects. However, in many applications in economics the data may contain nested groupings. This paper considers a two-way error component model with nested groupings for ill-posed data and proposes the GME estimator of the unknown parameters. The performance of this estimator is compared with existing methods on a simulated dataset. The results indicate that the GME method performs best in estimating the unknown parameters when the data are ill-posed.