• Title/Summary/Keyword: bayesian classification

Evaluation of Future Hydrologic Risk of Drought in Nakdong River Basin Using Bayesian Classification-Based Composite Drought Index (베이지안 분류 기반 통합가뭄지수를 활용한 낙동강 유역의 미래 가뭄에 대한 수문학적 위험도 분석)

  • Kim, Hyeok;Kim, Ji Eun;Kim, Jiyoung;Yoo, Jiyoung;Kim, Tae-Woong
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.3 / pp.309-319 / 2023
  • Recently, the frequency and intensity of meteorological disasters have increased due to climate change. In South Korea, vulnerability and the capacity to respond to climate change differ by region because of regional climate characteristics. Drought in particular results from various factors and has extensive meteorological, hydrological, and agricultural impacts. Coping with drought effectively therefore requires a composite drought index that accounts for these factors, together with a comprehensive evaluation of future droughts under climate change. This study evaluated the hydrologic risk (${\bar{R}}$) of future drought in the Nakdong River basin using a composite drought index based on Dynamic Naive Bayesian Classification (DNBC), calculated by applying the Standardized Precipitation Index (SPI), Streamflow Drought Index (SDI), Evaporative Stress Index (ESI), and Water Supply Capacity Index (WSCI) to the DNBC. The indices used in the DNBC were calculated from observation data and climate scenario data. A bivariate frequency analysis was performed on the severity and duration of the composite drought. Using the estimated bivariate return periods, hydrologic risks of drought were then calculated for the observation and future periods. Overall, the risk was highest during the near-future period (2021-2040) (${\bar{R}}$=0.572), and Miryang River (#2021) had the highest average risk (${\bar{R}}$=0.940). The hydrologic risk of the Nakdong River basin will increase considerably in the near future (2021-2040). During the far future (2041-2099), the hydrologic risk decreased in the northern basins and increased in the southern basins.
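
The abstract does not state how the hydrologic risk is derived from the return period. A common textbook definition, assumed here and not confirmed by the source, is the probability that an event with return period T years occurs at least once within an n-year horizon: R = 1 - (1 - 1/T)^n. The function name and parameters below are illustrative only.

```python
def hydrologic_risk(return_period_years: float, horizon_years: int) -> float:
    """Probability of at least one exceedance within the horizon.

    Assumes independent years and an annual exceedance probability of 1/T.
    This is the generic textbook formula, not necessarily the paper's.
    """
    if return_period_years < 1.0:
        raise ValueError("return period must be at least 1 year")
    if horizon_years < 0:
        raise ValueError("horizon must be non-negative")
    annual_probability = 1.0 / return_period_years
    return 1.0 - (1.0 - annual_probability) ** horizon_years


if __name__ == "__main__":
    # Risk over a 20-year horizon for a few hypothetical return periods.
    for period in (5.0, 20.0, 100.0):
        print(period, round(hydrologic_risk(period, 20), 3))
```

Under this definition, a shorter bivariate return period for a given severity and duration translates directly into a higher risk over the same horizon.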

Improvement of Classification Rate of Handwritten Digits by Combining Multiple Dynamic Topology-Preserving Self-Organizing Maps (다중 동적 위상보존 자기구성 지도의 결합을 통한 필기숫자 데이타의 분류율 향상)

  • Kim, Hyun-Don;Cho, Sung-Bae
    • Journal of KIISE:Software and Applications / v.28 no.12 / pp.875-884 / 2001
  • Although the self-organizing map (SOM) is widely used in fields such as data visualization and topology-preserving mapping, its topology must be fixed before training, which makes it difficult to apply to practical problems, and its classification capability is quite low despite good clustering performance. To overcome these shortcomings, this paper proposes the dynamic topology-preserving self-organizing map (DTSOM), which dynamically splits the output nodes on the map and trains them, and attempts to improve classification capability by combining multiple DTSOMs. To combine the DTSOMs, the K-Winner method, which produces K outputs using winner-node selection, is applied. It performs even better than conventional combining methods such as majority voting, weighting, BKS, Bayesian, Borda, Condorcet, and reliability sum. DTSOM removes the need to determine the topology in advance, and the classification rate increases significantly when multiple maps trained with different features are combined. Experimental results on handwritten digit recognition indicate that the proposed method effectively addresses the problems of the conventional SOM and improves the classification rate to 98.1%.
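
As a point of reference for the conventional combining methods the abstract lists, here is a minimal sketch of Borda count. It is not the paper's K-Winner method, whose details the abstract does not give. The input format (each classifier returns all class labels ranked best first) and the tie-breaking rule are assumptions made for illustration.

```python
from collections import defaultdict


def borda_combine(rankings):
    """Combine ranked outputs of several classifiers by Borda count.

    Each ranking lists class labels from most to least likely. A label in
    position p of a ranking of length n earns n - 1 - p points.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] += n - 1 - position
    # Highest total wins; ties go to the smallest label for determinism.
    return min(scores, key=lambda label: (-scores[label], label))


if __name__ == "__main__":
    # Three hypothetical classifiers ranking the digits 3, 5, and 8.
    print(borda_combine([[3, 8, 5], [8, 3, 5], [8, 5, 3]]))
```

Unlike majority voting, Borda count uses the full ranking from each classifier, so a class that is consistently ranked second can still win.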

An N-version Learning Approach to Enhance the Prediction Accuracy of Classification Systems in Genetics-based Learning Environments (유전학 기반 학습 환경하에서 분류 시스템의 성능 향상을 위한 엔-버전 학습법)

  • Kim, Yeong-Jun;Hong, Cheol-Ui
    • The Transactions of the Korea Information Processing Society / v.6 no.7 / pp.1841-1848 / 1999
  • DELVAUX is a genetics-based inductive learning system that learns a rule-set, consisting of Bayesian classification rules, from sets of examples for classification tasks. One problem DELVAUX faces in the rule-set learning process is that the process occasionally ends at a local optimum without finding the best rule-set. Another is that the process occasionally ends with a rule-set that performs well on the training examples but not on unseen testing examples. This paper describes efforts to alleviate these two problems, centering on the N-version learning approach, in which multiple rule-sets are learned and a classification system is constructed from them to improve overall performance. To implement the N-version learning approach, we propose a decision-making scheme that draws a decision from multiple rule-sets and a genetic algorithm approach that finds a good combination of rule-sets from a set of learned rule-sets. We also present empirical results that evaluate the effect of the N-version learning approach in the DELVAUX learning environment.

Review of Land Cover Classification Potential in River Spaces Using Satellite Imagery and Deep Learning-Based Image Training Method (딥 러닝 기반 이미지 트레이닝을 활용한 하천 공간 내 피복 분류 가능성 검토)

  • Woochul, Kang;Eun-kyung, Jang
    • Ecology and Resilient Infrastructure / v.9 no.4 / pp.218-227 / 2022
  • This study attempted land cover classification in river spaces, which provides important data for efficient river management, through deep learning-based image training. For this purpose, RGB images of the target section were labeled according to the category classification index of the major land cover map, and the model trained on the labeling results was used for land cover classification analysis. In addition, land cover in the river spaces was classified by unsupervised and supervised classification of openly available Sentinel-2 satellite images, and the results were compared with those of deep learning-based image classification. The deep learning-based approach gave more accurate predictions than unsupervised classification, and its results improved significantly for high-resolution images. The results showed that water areas and wetlands in river spaces can be classified, and with additional research the deep learning-based image training method for land cover classification could be used for river management.

A proper folder recommendation technique using frequent itemsets for efficient e-mail classification (효과적인 이메일 분류를 위한 빈발 항목집합 기반 최적 이메일 폴더 추천 기법)

  • Moon, Jong-Pil;Lee, Won-Suk;Chang, Joong-Hyuk
    • Journal of the Korea Society of Computer and Information / v.16 no.2 / pp.33-46 / 2011
  • Since e-mail has become an important means of communication and information sharing, much effort has gone into classifying e-mails efficiently by their contents. E-mails vary in length and style, and the words used in them are usually irregular. In addition, the criteria for e-mail classification are subjective. As a result, it is quite difficult to adapt conventional text classification techniques to e-mail classification efficiently. The classification in commercial e-mail programs relies on a simple text filtering technique in the e-mail client. In previous studies on automatic e-mail classification, the probability-based Naive Bayesian technique has been used to improve classification accuracy, and most of these studies deal with e-mail in English. This paper proposes a personalized recommendation technique for e-mail in Korean that uses frequent-pattern data mining. The proposed technique consists of two phases: pre-processing the e-mails in an e-mail folder and generating a profile for that folder. The generated profile is used to classify an e-mail into the most appropriate folder according to the user's subjective criteria. An e-mail classification system that adopts the proposed technique is also implemented.

Utilizing Visual Information for Non-contact Predicting Method of Friction Coefficient (마찰계수의 비접촉 추정을 위한 영상정보 활용방법)

  • Kim, Doo-Gyu;Kim, Ja-Young;Lee, Ji-Hong;Choi, Dong-Geol;Kweon, In-So
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.4 / pp.28-34 / 2010
  • In this paper, we propose an algorithm that uses visual information to predict the friction coefficient without contact. The coefficient of friction is very important when driving on roads and traversing obstacles. Our algorithm is based on terrain classification of visual images. As a non-contact approach, the proposed method has an advantage over methods that extract the material characteristics of the road with sensors touching the road surface. The method consists of a learning stage (experiments and material grouping) and a friction-coefficient prediction stage (a Bayesian classification prediction function). Both stages include vision preprocessing. Because the friction coefficient is predicted before entering the terrain, the algorithm can be very useful for avoiding slippery areas. We measured the friction coefficients of terrain experimentally and used the measurements as the real friction coefficients for the prediction method. To evaluate the algorithm, we report the error between the real and predicted friction coefficients.

Design and Implementation of Web Mail Filtering Agent for Personalized Classification (개인화된 분류를 위한 웹 메일 필터링 에이전트)

  • Jeong, Ok-Ran;Cho, Dong-Sub
    • The KIPS Transactions:PartB / v.10B no.7 / pp.853-862 / 2003
  • More and more people use e-mail purely on a personal basis, and the pool of e-mail users is growing daily. The volume of mail transmitted in electronic commerce is also increasing. Because e-mail is so convenient, a mass of spam floods in every day, yet automated techniques for learning to filter e-mail have not significantly affected the e-mail market. This paper proposes a Web Mail Filtering Agent for Personalized Classification, which automatically manages mail by adapting to the user. It is based on web mail, which can be accessed at any time and from any place and is not limited to a particular system. When new mail arrives, the agent first derives personal rules from its observations. Based on these rules, it automatically classifies the mail into categories according to its contents, then saves the classified mail in the relevant folders or deletes unnecessary mail and spam. To improve the system's accuracy, we applied a Bayesian algorithm with a dynamic threshold.

Improving the Performance of Machine Learning Models for Anomaly Detection based on Vibration Analog Signals (진동 아날로그 신호 기반의 이상상황 탐지를 위한 기계학습 모형의 성능지표 향상)

  • Jaehun Kim;Sangcheon Eom;Chulsoon Park
    • Journal of Korean Society of Industrial and Systems Engineering / v.47 no.2 / pp.1-9 / 2024
  • New motor development requires high-speed load testing with dynamo equipment to calculate the efficiency of the motor. Abnormal noise and vibration may occur in test equipment rotating at high speed due to misalignment of the connecting shaft or looseness of the fixation, which may lead to safety accidents. In this study, three single-axis vibration sensors for the X, Y, and Z axes were attached to the surface of the test motor to measure vibration. Analog data collected from these sensors was used in classification models for anomaly detection. Since the classification accuracy was only around 93%, commonly used hyperparameter optimization techniques such as Grid search, Random search, and Bayesian Optimization were applied to increase accuracy. In addition, the Response Surface Method based on Design of Experiment was used for hyperparameter optimization. However, these methods could improve accuracy only to a limited extent, because data sampled from an analog signal does not reflect the patterns hidden in the signal. To capture the pattern information in the sampled data, we therefore computed descriptive statistics of the analog data, such as mean, variance, skewness, kurtosis, and percentiles, and applied them to the classification models. Classification models using descriptive statistics showed a marked performance improvement. The developed model can be used as a monitoring system that detects abnormal conditions in the motor test.
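
A minimal sketch of the feature step described above: turning one window of sampled vibration values into descriptive statistics. The window contents, the population-moment definitions of skewness and kurtosis, and the choice of quartiles as the percentiles are assumptions; the abstract does not specify them.

```python
import statistics


def window_features(samples):
    """Summarize one window of sensor samples as descriptive statistics."""
    mean = statistics.fmean(samples)
    # Central moments (population form) for variance, skewness, kurtosis.
    m2 = statistics.fmean((x - mean) ** 2 for x in samples)
    m3 = statistics.fmean((x - mean) ** 3 for x in samples)
    m4 = statistics.fmean((x - mean) ** 4 for x in samples)
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0  # non-excess kurtosis
    q1, median, q3 = statistics.quantiles(samples, n=4)
    return {
        "mean": mean,
        "variance": m2,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "p25": q1,
        "p50": median,
        "p75": q3,
    }


if __name__ == "__main__":
    # Hypothetical single-axis readings for one window.
    print(window_features([0.1, 0.3, -0.2, 0.4, 2.5, 0.0, -0.1]))
```

Each window then yields one fixed-length feature vector per axis, which can be fed to any classifier in place of the raw samples.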

Optimal supervised LSA method using selective feature dimension reduction (선택적 자질 차원 축소를 이용한 최적의 지도적 LSA 방법)

  • Kim, Jung-Ho;Kim, Myung-Kyu;Cha, Myung-Hoon;In, Joo-Ho;Chae, Soo-Hoan
    • Science of Emotion and Sensibility / v.13 no.1 / pp.47-60 / 2010
  • Most research on classification has used kNN (k-Nearest Neighbor) and SVM (Support Vector Machine), which are known as learning-based models, and the Bayesian classifier and NNA (Neural Network Algorithm), which are known as statistics-based methods. However, these methods face space and time limitations when classifying the very large number of web pages on today's internet. Moreover, most classification studies use a uni-gram feature representation, which does not represent the real meaning of words well. Korean web page classification poses additional problems because Korean words often have multiple meanings (polysemy). For these reasons, LSA (Latent Semantic Analysis) has been proposed for classification in such environments (large data sets and word polysemy). LSA uses SVD (Singular Value Decomposition), which decomposes the original term-document matrix into three matrices and reduces their dimensions. This creates a new low-dimensional semantic space for representing vectors, which makes classification efficient and allows the latent meaning of words or documents (or web pages) to be analyzed. Although LSA is good at classification, it has some drawbacks. When SVD reduces the dimensions of the matrix and creates the new semantic space, it considers which dimensions represent the vectors well, not which dimensions discriminate between them well. This is why LSA does not improve classification performance as much as expected. In this paper, we propose a new LSA that selects the optimal dimensions for both discriminating and representing vectors, minimizing these drawbacks and improving performance. The proposed method shows better and more stable performance than other LSA methods in low-dimensional spaces. In addition, we obtain further improvement in classification by creating and selecting features through stopword reduction and statistical weighting of specific values.
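
The SVD step of standard LSA can be sketched as follows. This is plain truncated SVD with NumPy, which keeps the k largest singular values; it is not the supervised dimension selection the paper proposes, and the toy matrix and function name are hypothetical.

```python
import numpy as np


def lsa_document_vectors(term_doc, k):
    """Project documents into a k-dimensional latent semantic space.

    term_doc has one row per term and one column per document. The SVD
    factors it as U @ diag(s) @ Vt; documents are represented by the
    first k rows of diag(s) @ Vt, transposed to one row per document.
    """
    matrix = np.asarray(term_doc, dtype=float)
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (s[:k, None] * vt[:k]).T


if __name__ == "__main__":
    # Toy term-document counts: 4 terms (rows) by 3 documents (columns).
    counts = [[2, 0, 1],
              [1, 0, 0],
              [0, 3, 1],
              [0, 1, 2]]
    print(lsa_document_vectors(counts, k=2))
```

Keeping the largest singular values preserves the directions that best reconstruct the matrix, which is the "represent well" criterion the abstract contrasts with discriminating between classes.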

Consideration for Setting Reference Range for Adrenocorticotropic Hormone Test according to Blood Collection Time (채혈 시간에 따른 부신피질 자극 호르몬 검사의 참고치 설정에 관한 고찰)

  • Ji-Hye Park;Jin-Ju Choi;Soo-Yeon Lim;Seon-Hee Yoo;Sun-Ho Lee
    • The Korean Journal of Nuclear Medicine Technology / v.27 no.1 / pp.42-46 / 2023
  • Purpose: The reference range stated for the Adrenocorticotropic Hormone reagent used in our laboratory is 10-60 pg/mL from 8 a.m. to 10 a.m. and 6-30 pg/mL from 8 p.m. to 10 p.m. However, for outpatients, blood is mainly collected between 10 a.m. and 6 p.m., which accounts for 57.8% of the total. This study therefore aims to support more accurate diagnosis by reevaluating the reference range provided by the reagent manufacturer and setting a split-timed reference range. Materials and Methods: Patients whose blood was collected before 10 a.m. were assigned to group A (68 people), and patients whose blood was collected after 10 a.m. were assigned to group B (80 people). A t-test was performed to test the significance of the difference between the groups. We also checked whether gender needed to be set as a subgroup. The reference range was calculated using the Bayesian method and the Hoffmann method. Results: The reference range of group A was 8.6 to 60.6 pg/mL by the Bayesian method and 3.6 to 61.3 pg/mL by the Hoffmann method. The reference range of group B was 6.9 to 50.5 pg/mL by the Bayesian method and 2.3 to 48.9 pg/mL by the Hoffmann method. Conclusion: This study concluded that a split-timed reference range is necessary. It showed that the later the blood collection time, the lower the Adrenocorticotropic Hormone level, indicating that blood collection time is clinically important for patients. If a larger number of subjects is included in future work, a more systematic and accurate reference range can be set.
