• Title/Abstract/Keywords: Maximum Entropy

Fine-Grained Named Entity Recognition using Conditional Random Fields for Question Answering (Conditional Random Fields를 이용한 세부 분류 개체명 인식)

  • Lee, Chang-Ki; Hwang, Yi-Gyu; Oh, Hyo-Jung; Lim, Soo-Jong; Heo, Jeong; Lee, Chung-Hee; Kim, Hyeon-Jin; Wang, Ji-Hyun; Jang, Myung-Gil
    • Annual Conference on Human and Language Technology / 2006.10e / pp.268-272 / 2006
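  • Question answering systems use fine-grained named entities to find the answer to a user's question. For fine-grained named entity recognition, most systems first perform general coarse-grained named entity recognition and then subdivide the results into fine-grained classes using dictionaries and similar resources. In this paper, we use Conditional Random Fields for fine-grained named entity recognition for question answering. We divide named entity recognition into two stages, named entity boundary detection and classification of the detected entities, using Conditional Random Fields for boundary detection and Maximum Entropy for classification. In experiments on 147 fine-grained named entity classes, we obtained a precision of 85.8%, a recall of 81.1%, and an F1 of 83.4, while training time fell to 27% of that of the baseline model and performance improved. Applying the proposed fine-grained named entity recognizer to a question answering system yielded a 26% performance improvement.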
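The three scores reported in this abstract can be checked against each other, since F1 is the harmonic mean of precision and recall. The short sketch below does only that arithmetic; the function name `f1_score` is an illustrative choice, and reading the first reported figure (85.8%) as precision is an assumption that the result supports.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, in percent.
reported = f1_score(85.8, 81.1)
print(round(reported, 1))  # 83.4
```

The result agrees with the reported F1 of 83.4, which supports reading the first figure as precision rather than accuracy.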

Probabilistic Models for Local Patterns Analysis

  • Salim, Khiat; Hafida, Belbachir; Ahmed, Rahal Sid
    • Journal of Information Processing Systems / v.10 no.1 / pp.145-161 / 2014
  • Many large organizations now have multiple data sources (MDS) distributed over the different branches of an interstate company. Local pattern analysis has become an effective strategy for MDS mining in national and international organizations. It consists of mining the individual datasets to obtain frequent patterns, which are forwarded to a central site for global pattern analysis. Various synthesizing models [2,3,4,5,6,7,8,26] have been proposed to build global patterns from the forwarded patterns. Ideally, the rules synthesized from the forwarded patterns should closely match the mono-mining results (i.e., the results that would be obtained if all of the databases were put together and mined as one). When a pattern is present at a site but fails to satisfy the minimum support threshold, it is not allowed to take part in the pattern synthesizing process. This process can therefore lose some interesting patterns that could help the decision maker make the right decision. For such situations we propose applying a probabilistic model in the synthesizing process. An adequate choice of probabilistic model can improve the quality of the discovered patterns. In this paper, we perform a comprehensive study of various probabilistic models that can be applied in the synthesizing process, and we choose and improve one of them to improve the synthesizing results. Finally, experiments on a public database are presented to demonstrate the efficiency of our proposed synthesizing method.
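The information loss this abstract describes can be made concrete with a small numeric sketch. Everything below is hypothetical: the site sizes, the local supports, the threshold, and the `fill` placeholder are invented for illustration, and the size-weighted average is a generic synthesizing rule, not the probabilistic model the authors choose.

```python
# Hypothetical per-site data: (number of transactions, true local support
# of one pattern). Values are invented for illustration.
SITES = [(1000, 0.30), (4000, 0.08), (5000, 0.09)]
MIN_SUPPORT = 0.10  # assumed local minimum support threshold


def mono_mining_support(sites):
    """Support obtained if all databases were pooled and mined together."""
    total = sum(n for n, _ in sites)
    return sum(n * s for n, s in sites) / total


def synthesized_support(sites, min_support, fill=0.0):
    """Size-weighted synthesis using only the forwarded supports.

    A site forwards its support only when it meets min_support;
    otherwise `fill` stands in for the value the central site never sees.
    """
    total = sum(n for n, _ in sites)
    return sum(n * (s if s >= min_support else fill) for n, s in sites) / total


print(round(mono_mining_support(SITES), 3))                         # 0.107
print(round(synthesized_support(SITES, MIN_SUPPORT), 3))            # 0.03
print(round(synthesized_support(SITES, MIN_SUPPORT, fill=0.05), 3)) # 0.075
```

In this toy case the pattern is globally frequent (0.107) yet looks rare after synthesis (0.03) because two sites never forwarded it. Replacing the missing supports with any informed estimate narrows the gap, which is the role a probabilistic model would play.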

Habitat Potential Evaluation Using Maxent Model - Focused on Riparian Distance, Stream Order and Land Use - (Maxent 모형을 이용한 서식지 잠재력 평가 - 하천으로부터의 거리, 하천의 차수, 토지이용을 중심으로-)

  • Lee, Dong-Kun; Kim, Ho-Gul
    • Journal of the Korean Society of Environmental Restoration Technology / v.13 no.6 / pp.161-172 / 2010
  • As interest in biodiversity has increased around the world, research on evaluating habitat potential is also increasing in order to find and understand valuable habitats. This study focuses on the significance of streams in evaluating habitat potential. Its purpose is to evaluate habitat potential using streams as a main variable and to understand the relationship between the variables and habitat potential. A basin is a unit that has hydrological properties and interacts dynamically with the ecosystem. In particular, biodiversity and habitat suitability in a basin are directly correlated with streams. Existing studies have also proposed evaluating habitat potential at the basin scale, but they used forest, slope and road as the main variables. Although streams are considered the most important factor in a basin, researchers have not used them as a main variable. Therefore, this study selects three variables that can represent hydrological properties, namely riparian distance, stream order and land use disturbance, and evaluates habitat potential. Habitat potential is analyzed using Maxent (maximum entropy model), with vertebrate presence data as the dependent variable and a stream order map and a land cover map as the base data for the independent variables. The analysis shows that habitat potential is higher in riparian and upstream areas and lower in frequently disturbed areas. This indicates that vertebrates prefer habitat that is close to streams, upstream, and less disturbed. In particular, mammals prefer areas adjacent to streams and forest, and reptiles prefer upstream areas. Birds prefer areas adjacent to streams and midstream areas, and amphibians prefer areas adjacent to streams and upstream areas. The results of this research could help to establish habitat conservation strategies at the basin scale in the future.
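The maximum entropy idea behind this kind of habitat model can be sketched in a few lines: among all probability distributions over map cells, pick the one whose expected environmental features match the averages observed at presence sites, which has the form p(cell) ∝ exp(w · f(cell)). The code below is a minimal, unregularized sketch under that assumption. It is not the Maxent software used in the study, and the cell features (scaled stream distance and disturbance) and presence records are hypothetical.

```python
import math


def predict(features, w):
    """Return p(cell) proportional to exp(w . f(cell)), normalized over all cells."""
    scores = [sum(wj * fj for wj, fj in zip(w, f)) for f in features]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def fit_maxent(features, presence_idx, steps=5000, lr=1.0):
    """Gradient ascent on the mean log-likelihood of the presence cells.

    The gradient is (presence-sample feature mean) minus (model-expected
    feature mean), so at the optimum the two match, which is the
    maximum entropy constraint.
    """
    n_feat = len(features[0])
    target = [sum(features[i][j] for i in presence_idx) / len(presence_idx)
              for j in range(n_feat)]
    w = [0.0] * n_feat
    for _ in range(steps):
        p = predict(features, w)
        expected = [sum(pi * f[j] for pi, f in zip(p, features))
                    for j in range(n_feat)]
        w = [w[j] + lr * (target[j] - expected[j]) for j in range(n_feat)]
    return w


# Hypothetical cells: [scaled distance to stream, scaled disturbance].
CELLS = [[0.1, 0.1], [0.2, 0.3], [0.5, 0.2], [0.6, 0.8], [0.9, 0.5], [0.8, 0.9]]
PRESENCE = [0, 0, 1, 2]  # cell indices of hypothetical presence records

weights = fit_maxent(CELLS, PRESENCE)
suitability = predict(CELLS, weights)
print([round(s, 3) for s in suitability])
```

Production tools add regularization and richer feature classes on top of this core, so the sketch should be read as the principle only.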

Uncertainty analysis of quantitative rainfall estimation based on weather radars (기상레이더 기반 정량적 강수추정에서의 불확실성 분석)

  • Lee, Jae-Kyoung
    • Proceedings of the Korea Water Resources Association Conference / 2017.05a / pp.23-23 / 2017
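  • Because weather radar cannot estimate rainfall directly, various sources of uncertainty arise during quantitative rainfall estimation, yet quantifying and reducing them is difficult. The reasons are as follows. First, uncertainty has not been comprehensively quantified and analyzed across the whole chain from radar observation to quantitative rainfall estimation. Second, because the total uncertainty is not reported, the share of each stage's uncertainty in the total cannot be stated. Finally, existing studies use various methods to reduce uncertainty but do not show, in terms of uncertainty, how effective those methods are. This study therefore proposes an approach using Maximum Entropy (ME) and the Uncertainty Delta Method (UDM) and estimates how uncertainty propagates stage by stage through the process of quantitative rainfall estimation from weather radar. Using 18 rainfall events that occurred over the entire Korean Peninsula in summer (June to August) 2012, only the quality control (Open Radar Product Generator quality control algorithm, fuzzy algorithm), rainfall estimation (Window Probability Matching Method, Marshall-Palmer relationship), and post-processing correction (Local Gauge Correction method, Gauge to Radar ratio method) stages were carried out, and the uncertainty of each stage of radar quantitative rainfall estimation was quantified on that basis. The final uncertainty was lower than the uncertainty at the observation stage, but uncertainty increased at the rainfall estimation stage. This means that radar rainfall estimates can differ greatly depending on which rainfall estimation equation is applied. The uncertainty quantification method presented in this study therefore makes it possible, first, to quantify the total and stage-wise uncertainty; second, to state the ratio of each stage's uncertainty to the final uncertainty; and finally, to trace how uncertainty propagates from stage to stage. This will make it possible to identify the main sources of uncertainty in quantitative radar rainfall estimation and to concentrate investment on them. Through this process, more accurate quantitative radar rainfall estimation is expected to become possible.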
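The sensitivity of the rainfall estimation stage to the chosen equation can be illustrated with a reflectivity-to-rain-rate conversion of the form Z = a·R^b. The sketch below assumes the commonly cited Marshall-Palmer coefficients a = 200 and b = 1.6, which the abstract does not state. The second coefficient pair (300, 1.4) is another widely used Z-R relation included only for contrast; it is not one of the methods compared in the study.

```python
def rain_rate_mm_per_h(dbz: float, a: float = 200.0, b: float = 1.6) -> float:
    """Invert Z = a * R**b for rain rate R (mm/h).

    dbz is radar reflectivity in dBZ; Z is in mm^6 m^-3, with dBZ = 10*log10(Z).
    The defaults are the classic Marshall-Palmer coefficients (an assumption here).
    """
    z_linear = 10.0 ** (dbz / 10.0)
    return (z_linear / a) ** (1.0 / b)


for dbz in (20.0, 30.0, 40.0, 50.0):
    mp = rain_rate_mm_per_h(dbz)                    # Z = 200 R^1.6
    alt = rain_rate_mm_per_h(dbz, a=300.0, b=1.4)   # Z = 300 R^1.4, for contrast
    print(f"{dbz:.0f} dBZ: {mp:.2f} mm/h vs {alt:.2f} mm/h")
```

The two relations give different rain rates from the same reflectivity, and the gap is not constant across the reflectivity range, which is why the choice of estimation equation feeds directly into the uncertainty at this stage.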

Predicting the Suitable Habitat of Invasive Alien Plant Conyza bonariensis based on Climate Change Scenarios (기후변화 시나리오에 의한 외래식물 실망초(Conyza bonariensis)의 서식지 분포 예측)

  • Lee, Yong-Ho; Oh, Young-Ju; Hong, Sun-Hea; Na, Chea-Sun; Na, Young-Eun; Kim, Chang-Suk; Sohn, Soo-In
    • Journal of Climate Change Research / v.6 no.3 / pp.243-248 / 2015
  • This study was conducted to predict changes in the potential distribution of the invasive alien plant Conyza bonariensis in Korea. C. bonariensis was found in southern Korea (Jeju, the south coast, and the southwest coast). Its habitats were roadsides, bare ground, farmland, and pasture, where human interference was severe. Owing to the seed characteristics of the Compositae, C. bonariensis disperses over long distances and spreads easily through the movement of wind, vehicles and people. C. canadensis, in the same genus Conyza, has already spread nationwide and is difficult to manage. We used maximum entropy modeling (MaxEnt) to analyze the environmental influences on the distribution of C. bonariensis and to project it under two RCP scenarios, RCP 4.5 and RCP 8.5. The results indicated that annual mean temperature, elevation and temperature seasonality contributed most to the potential distribution of C. bonariensis. The area under the curve (AUC) of the model was 0.9. Under the future climate scenarios, the model predicted that the potential distribution of C. bonariensis will increase by 338% under RCP 4.5 and by 769% under RCP 8.5 in the 2100s.

Potential impact of climate change on the species richness of subalpine plant species in the mountain national parks of South Korea

  • Adhikari, Pradeep; Shin, Man-Seok; Jeon, Ja-Young; Kim, Hyun Woo; Hong, Seungbum; Seo, Changwan
    • Journal of Ecology and Environment / v.42 no.4 / pp.298-307 / 2018
  • Background: Subalpine ecosystems at high altitudes and latitudes are particularly sensitive to climate change. In South Korea, the species richness of subalpine plants under future climate change has not been well studied. This study therefore assesses the potential impact of climate change on the species richness of subalpine plant species (14 species) in the 17 mountain national parks (MNPs) of South Korea under the representative concentration pathway (RCP) 4.5 and RCP 8.5 climate change scenarios, using maximum entropy (MaxEnt) and Migclim for the years 2050 and 2070. Results: Altogether, 723 occurrence points of 14 species and six selected variables were used in modeling. The models developed for all species showed excellent performance (AUC > 0.89 and TSS > 0.70). The results predicted a significant loss of species richness in all MNPs. Under RCP 4.5, the predicted reduction ranged from 15.38% to 94.02% by 2050 and from 21.42% to 96.64% by 2070. Similarly, under RCP 8.5, it ranged from 15.38% to 97.9% by 2050 and from 23.07% to 100% by 2070. The reduction was relatively high in the MNPs located in the central regions (Songnisan and Gyeryongsan), eastern region (Juwangsan), and southern regions (Mudeungsan, Wolchulsan, Hallasan, and Jirisan) compared to the northern and northeastern regions (Odaesan, Seoraksan, Chiaksan, and Taebaeksan). Conclusions: This result indicates that climate change has a large effect on subalpine plant species in the MNPs at low altitudes and latitudes. This study suggests that subalpine species are highly threatened by climate change and that immediate action is required to conserve them and to minimize the effect of climate change.

Application of Species Distribution Model for Predicting Areas at Risk of Highly Pathogenic Avian Influenza in the Republic of Korea (종 분포 모형을 이용한 국내 고병원성 조류인플루엔자 발생 위험지역 추정)

  • Kim, Euttm; Pak, Son-Il
    • Journal of Veterinary Clinics / v.36 no.1 / pp.23-29 / 2019
  • While research findings suggest that highly pathogenic avian influenza (HPAI) is the leading cause of economic loss in the Korean poultry industry, with an estimated cumulative impact of $909 million since 2003, identifying the environmental and anthropogenic risk factors involved remains a challenge. The objective of this study was to identify areas at high risk for potential HPAI outbreaks according to the likelihood of HPAI virus detection in wild birds. This study integrates spatial information regarding HPAI surveillance with relevant demographic and environmental factors collected between 2003 and 2018. Maximum Entropy (Maxent) species distribution modeling with presence-only data was used to model the spatial risk of HPAI virus. We used historical data on HPAI occurrence in wild birds during the period 2003-2018, collected by the National Quarantine Inspection Agency of Korea. The database contains a total of 1,065 HPAI cases (farms) tied to 168 unique locations for wild birds. Among the environmental variables, the most effective predictors of the potential distribution of HPAI in wild birds were (in order of importance) altitude, number of HPAI outbreaks at farm level, daily amount of manure processed, and number of wild birds migrating into Korea. The area under the receiver operating characteristic curve for the 10 Maxent replicate runs of the model with twelve variables was 0.855 with a standard deviation of 0.012, which indicates that the model performance was excellent. Results revealed that the geographic area at risk of HPAI is heterogeneously distributed throughout the country, with higher likelihood in the west and coastal areas. The results may help biosecurity authorities to design risk-based surveillance and implement control interventions optimized for the areas at highest risk of HPAI outbreaks.
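The area under the ROC curve quoted in this abstract has a simple pairwise reading: it is the probability that a randomly chosen presence location receives a higher model score than a randomly chosen background location. The sketch below computes it that way. The scores are hypothetical and the function name `auc` is an illustrative choice, not part of the Maxent output.

```python
def auc(presence_scores, background_scores):
    """Probability that a random presence site outscores a random background site.

    Ties count as half a win. This equals the area under the ROC curve.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical model scores at presence and background locations.
presence = [0.9, 0.8, 0.4]
background = [0.7, 0.3, 0.2, 0.1]
print(round(auc(presence, background), 3))  # 0.917
```

Repeating this over replicate runs and taking the mean and standard deviation gives a summary of the kind reported in the abstract.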

Thermal stability, magnetic and magnetocaloric properties of Gd55Co35M10 (M = Si, Zr and Nb) melt-spun ribbons

  • Jiao, D.L.; Zhong, X.C.; Zhang, H.; Qiu, W.Q.; Liu, Z.W.; Ramanujan, R.V.
    • Current Applied Physics / v.18 no.12 / pp.1523-1527 / 2018
  • The thermal stability, magnetic and magnetocaloric properties of $Gd_{55}Co_{35}M_{10}$ (M = Si, Zr and Nb) melt-spun ribbons were studied. The relatively high reduced glass transition temperature ($T_{x1}/T_m$ > 0.60) and low melting point ($T_m$) resulted in excellent glass forming ability (GFA). The Curie temperatures ($T_C$) of the melt-spun amorphous ribbons $Gd_{55}Co_{35}M_{10}$ for M = Si, Zr and Nb were 166, 148 and 173 K, respectively. For a magnetic field change of 2 T, the maximum magnetic entropy change $(-{\Delta}S_M)^{max}$ for $Gd_{55}Co_{35}Si_{10}$, $Gd_{55}Co_{35}Zr_{10}$ and $Gd_{55}Co_{35}Nb_{10}$ was 2.86, 4.28 and $4.05\;J\;kg^{-1}K^{-1}$, while the refrigeration capacity (RC) values were 154, 274 and $174\;J\;kg^{-1}$, respectively. The $RC_{FWHM}$ values of the amorphous alloys $Gd_{55}Co_{35}M_{10}$ (M = Si, Zr and Nb) are comparable to or larger than that of the $LaFe_{11.6}Si_{1.4}$ crystalline alloy. Large values of $(-{\Delta}S_M)^{max}$ and RC, along with good thermal stability, make $Gd_{55}Co_{35}M_{10}$ (M = Si, Zr and Nb) amorphous alloys potential materials for magnetic cooling over a wide temperature range from 150 to 175 K, e.g., as part of a gas liquefaction process.

Magnetocaloric Properties of AlFe2B2 Including Paramagnetic Impurities of Al13Fe4

  • Lee, J.W.; Song, M.S.; Cho, K.K.; Cho, B.K.; Nam, Chunghee
    • Journal of the Korean Physical Society / v.73 no.10 / pp.1555-1560 / 2018
  • $AlFe_2B_2$ produced using a conventional arc melter is a ferromagnetic material with a Curie temperature ($T_C$) of around 300 K, but arc melting generates paramagnetic $Al_{13}Fe_4$ impurities during the synthesis of $AlFe_2B_2$. These impurities are thought to decrease the magnetocaloric effect (MCE). To investigate the effects of $Al_{13}Fe_4$ impurities on the MCE, we prepared and compared as-cast and acid-treated samples, where the acid treatment was performed to remove the $Al_{13}Fe_4$ impurities. For the structural analysis, powder X-ray diffraction was carried out, and the measured data were subjected to Rietveld refinement. The presence of $Al_{13}Fe_4$ impurities in the as-cast sample was observed in the phase analysis. Magnetic properties of the as-cast and acid-treated $AlFe_2B_2$ samples were investigated using Superconducting Quantum Interference Device (SQUID) measurements. Arrott plots obtained from isothermal magnetization measurements showed that $AlFe_2B_2$ undergoes a second-order magnetic phase transition (SOMT). The $T_C$ and the saturation magnetization increased for the acid-treated sample owing to removal of the paramagnetic impurities. As a consequence, the magnetic entropy change ($-{\Delta}S$) increased in the pure $AlFe_2B_2$ samples, but the full width at half maximum in the plot of $-{\Delta}S$ vs. T decreased owing to the absence of impurities.
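Magnetic entropy change is usually obtained from isothermal magnetization curves through the Maxwell relation, ΔS_M(T, H) = ∫ (∂M/∂T)_H dH, integrated from zero field to the maximum field. The sketch below is a generic numerical version of that relation, assuming that is how the quantity was derived here (the abstract does not say). The function name, the data layout and the synthetic magnetization values are all hypothetical.

```python
def magnetic_entropy_change(temps, fields, mag):
    """Approximate the magnetic entropy change between adjacent isotherms.

    temps:  temperatures in K, ascending.
    fields: applied fields mu0*H in T, ascending, starting at 0.
    mag:    mag[i][j] is the magnetization in A m^2/kg at temps[i], fields[j].
    Returns (T_mid, dS) pairs with dS in J kg^-1 K^-1.
    """
    result = []
    for i in range(len(temps) - 1):
        d_temp = temps[i + 1] - temps[i]
        # Finite-difference estimate of dM/dT at each field value.
        dm_dt = [(mag[i + 1][j] - mag[i][j]) / d_temp for j in range(len(fields))]
        # Trapezoidal integration of dM/dT over the field axis.
        d_s = sum(0.5 * (dm_dt[j] + dm_dt[j + 1]) * (fields[j + 1] - fields[j])
                  for j in range(len(fields) - 1))
        result.append((0.5 * (temps[i] + temps[i + 1]), d_s))
    return result


# Synthetic isotherms with M = (100 - 0.2*T) * H, so dM/dT = -0.2 * H.
temps = [290.0, 300.0, 310.0]
fields = [0.0, 1.0, 2.0]
mag = [[(100.0 - 0.2 * t) * h for h in fields] for t in temps]
for t_mid, d_s in magnetic_entropy_change(temps, fields, mag):
    print(t_mid, round(d_s, 6))
```

For this synthetic input the analytic answer is -0.2 × 2² / 2 = -0.4 at every temperature, so the printed values give a quick sanity check. With measured data, the peak of -ΔS against T and its full width at half maximum are the quantities the abstract compares between samples.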

Modeling Species Distributions to Predict Seasonal Habitat Range of Invasive Fish in the Urban Stream via Environmental DNA

  • Kang, Yujin; Shin, Wonhyeop; Yun, Jiweon; Kim, Yonghwan; Song, Youngkeun
    • Proceedings of the National Institute of Ecology of the Republic of Korea / v.3 no.1 / pp.54-65 / 2022
  • Species distribution models are a useful tool for predicting the future distribution of invasive species and establishing a preemptive response. However, few studies have considered potential habitat for aquatic organisms, and the number of target sites has been relatively small compared to the study area. Environmental DNA (eDNA) is an emerging method for obtaining large amounts of species presence data with high detectability. This study therefore applied eDNA survey results for Micropterus salmoides and Lepomis macrochirus to seasonal species distribution modeling in the Anyang stream network. The Maximum Entropy (MaxEnt) model estimated that both species extended their potential distribution area in October compared to July, from 89.1% (12,110,675 m²) to 99.3% (13,625,525 m²) for M. salmoides and from 76.6% (10,407,350 m²) to 100% (13,724,225 m²) for L. macrochirus. The predicted values for individual streams varied by species and season. The models also identified the significant environmental variables that affect distribution by season and species. Our results show the potential of eDNA methodology as a way to retrieve species data effectively and use them to build a model.