• Title/Summary/Keyword: Bayes method

Development of the Bayesian method and its application to the water resources field (베이지안 기법의 발전 및 수자원 분야에의 적용)

  • Na, Wooyoung;Yoo, Chulsang
    • Journal of Korea Water Resources Association
    • /
    • v.54 no.1
    • /
    • pp.1-13
    • /
    • 2021
  • The Bayesian method is a useful statistical tool in many fields, including water resources. This study therefore reviews the background of Bayesian statistics and its application to the water resources field. First, the history of the Bayesian method from its origin to the present and the achievements of Bayesian statisticians are summarized. Next, the derivation of Bayes' theorem, which is the basis of the Bayesian method, is presented, and the roles of its three elements (the prior distribution, the likelihood function, and the posterior distribution) are explained. In addition, the distinctive features and advantages of Bayesian statistics are summarized. Finally, applications of the Bayesian method in water resources are summarized by category. As information and big data become more prevalent, the Bayesian method is expected to be used more actively in the water resources field.
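
The three elements named in this abstract combine as posterior ∝ prior × likelihood. The following is a minimal Python sketch of that update on a grid. The binomial example (7 successes in 10 trials), the flat prior, and the function name `grid_posterior` are illustrative assumptions and do not come from the paper.

```python
from math import comb

def grid_posterior(successes, trials, grid_size=101):
    """Posterior over a success probability theta on a uniform grid."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    # Prior: flat over the grid (an assumption made for illustration).
    prior = [1.0 / grid_size] * grid_size
    # Likelihood: binomial probability of the observed count at each theta.
    likelihood = [
        comb(trials, successes) * t**successes * (1 - t) ** (trials - successes)
        for t in thetas
    ]
    # Bayes' theorem: posterior = prior * likelihood / evidence.
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)
    posterior = [u / evidence for u in unnormalized]
    return thetas, posterior

thetas, posterior = grid_posterior(successes=7, trials=10)
posterior_mean = sum(t * p for t, p in zip(thetas, posterior))
map_theta = thetas[max(range(len(thetas)), key=posterior.__getitem__)]
```

With a flat prior the posterior mode coincides with the observed proportion, while the posterior mean is pulled slightly toward 0.5.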

Study on Anomaly Detection Method of Improper Foods using Import Food Big data (수입식품 빅데이터를 이용한 부적합식품 탐지 시스템에 관한 연구)

  • Cho, Sanggoo;Choi, Gyunghyun
    • The Journal of Bigdata
    • /
    • v.3 no.2
    • /
    • pp.19-33
    • /
    • 2018
  • Owing to the growth of FTAs, food trade, and increasingly diverse consumer preferences, food imports have increased at a tremendous rate every year. While inspections cover about 20% of total food imports, the budget and manpower available for the government's import inspection control are reaching their limits. Sudden imported-food accidents can cause enormous social and economic losses. Therefore, a predictive system that forecasts the compliance of food imports, together with preemptive measures, would greatly improve the efficiency and effectiveness of import safety control management. A large amount of data has already accumulated from the past. Processed foods account for 75% of total food imports. Big data analysis and analytical techniques are used to extract meaningful information from large amounts of data. Unfortunately, few studies have analyzed imported food or its implications using food import big data. In this context, this study applied a variety of machine learning classification algorithms and proposed a data preprocessing method that generates new derived variables to improve model accuracy. In addition, the study compared the performance of the predictive classification algorithms with that of general base classifiers. Among the base classifiers, the Gaussian Naïve Bayes prediction model showed the best performance in detecting and predicting the nonconformity of imported food. In the future, the anomaly detection model using Gaussian Naïve Bayes is expected to be applied in practice. The predictive model will reduce the burden of inspecting imported food and increase the non-conformity detection rate, which will greatly benefit the efficiency of food import safety control and the speed of import customs clearance.
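
For readers unfamiliar with Gaussian Naïve Bayes, the sketch below shows the core idea in plain Python: each class gets a per-feature mean and variance, and a new record is assigned to the class with the highest log posterior. The two numeric features, the toy records, and the labels `"pass"` and `"fail"` are hypothetical and are not taken from the study's data or derived variables.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        columns = list(zip(*rows))
        means = [sum(c) / len(c) for c in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in c) / len(c) + 1e-9
            for c, m in zip(columns, means)
        ]
        model[label] = (math.log(len(rows) / len(X)), means, variances)
    return model

def predict(model, row):
    """Return the class with the largest log prior + Gaussian log likelihood."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
        return log_prior + log_lik
    return max(model, key=log_posterior)

# Hypothetical inspection records: two numeric features per shipment.
X = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.2], [3.0, 0.5], [3.2, 0.4], [2.8, 0.7]]
y = ["pass", "pass", "pass", "fail", "fail", "fail"]
model = fit_gaussian_nb(X, y)
```

Calling `predict(model, [3.1, 0.5])` on this toy data returns the label of the nearer cluster.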

Development of Supervised Machine Learning based Catalog Entry Classification and Recommendation System (지도학습 머신러닝 기반 카테고리 목록 분류 및 추천 시스템 구현)

  • Lee, Hyung-Woo
    • Journal of Internet Computing and Services
    • /
    • v.20 no.1
    • /
    • pp.57-65
    • /
    • 2019
  • The Domeggook B2B online shopping mall has a market share of over 70%, with more than 2 million members and 800,000 items sold per day. However, because the same or similar items are registered under different catalog entries, buyers have difficulty searching for items, and managing a large B2B shopping mall is also problematic. Therefore, in this study, we developed an automatic catalog entry classification and recommendation system for products using a semi-supervised machine learning method based on the shopping mall's large volume of past purchase information. Specifically, when a seller enters item registration information in natural language, KoNLPy morphological analysis is performed, and the Naïve Bayes classification method is applied to automatically recommend the most suitable catalog information for the item. As a result, it was possible to improve both the search speed and the total sales of the shopping mall by efficiently improving the accuracy of catalog entries.
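
The recommendation step described above can be sketched as a multinomial Naïve Bayes classifier with Laplace smoothing. This is only an illustration: whitespace splitting stands in for the KoNLPy morphological analysis, and the item descriptions, category names, and the functions `train` and `recommend` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (item description, catalog category) pairs."""
    category_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, category in examples:
        tokens = text.lower().split()  # stand-in for morphological analysis
        category_counts[category] += 1
        word_counts[category].update(tokens)
        vocab.update(tokens)
    return category_counts, word_counts, vocab

def recommend(model, text, top_k=2):
    """Rank categories by log prior plus smoothed log word likelihoods."""
    category_counts, word_counts, vocab = model
    total = sum(category_counts.values())
    scores = {}
    for category in category_counts:
        denom = sum(word_counts[category].values()) + len(vocab)
        score = math.log(category_counts[category] / total)
        for token in text.lower().split():
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[category][token] + 1) / denom)
        scores[category] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

model = train([
    ("wireless bluetooth earphones", "audio"),
    ("bluetooth speaker portable", "audio"),
    ("cotton t-shirt men", "clothing"),
    ("women cotton dress", "clothing"),
    ("stainless steel kettle", "kitchen"),
    ("steel frying pan", "kitchen"),
])
suggestions = recommend(model, "portable bluetooth earphones")
```

Returning a short ranked list rather than a single label matches the recommendation use case, where the seller makes the final choice.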

Classification Accuracy Improvement for Decision Tree (의사결정트리의 분류 정확도 향상)

  • Rezene, Mehari Marta;Park, Sanghyun
    • Annual Conference of KIPS
    • /
    • 2017.04a
    • /
    • pp.787-790
    • /
    • 2017
  • Data quality is a main issue in classification problems; in general, noisy instances in the training dataset prevent robust classification performance. Such instances may cause the generated decision tree to overfit, and its accuracy may decrease. Decision trees are useful, efficient, and commonly used for solving various real-world classification problems in data mining. In this paper, we introduce a preprocessing technique to improve the classification accuracy of the C4.5 decision tree algorithm. In the proposed preprocessing method, we apply the naive Bayes classifier to remove noisy instances from the training dataset. We applied the proposed method to a real e-commerce sales dataset to compare its performance against the existing C4.5 decision tree classifier. In the experiments, the proposed method improved the classification accuracy by 8.5% and 14.32% using the training dataset and 10-fold cross-validation, respectively.
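
One plausible reading of the preprocessing step is sketched below: fit a naive Bayes classifier on the training set and drop every instance whose recorded label disagrees with the classifier's prediction, then pass the cleaned set to the tree learner. The abstract does not state the exact removal criterion, so that rule, the categorical features, and the toy records are assumptions. The C4.5 stage itself is not shown.

```python
import math
from collections import Counter, defaultdict

def fit_categorical_nb(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(lambda: defaultdict(Counter))
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[label][i][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values

def nb_predict(model, row):
    """Most probable class under naive Bayes with Laplace smoothing."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    def log_posterior(label):
        score = math.log(class_counts[label] / total)
        for i, value in enumerate(row):
            numerator = value_counts[label][i][value] + 1
            denominator = class_counts[label] + len(feature_values[i])
            score += math.log(numerator / denominator)
        return score
    return max(class_counts, key=log_posterior)

def filter_noisy(rows, labels):
    """Keep only instances whose label agrees with the naive Bayes prediction."""
    model = fit_categorical_nb(rows, labels)
    kept = [(r, l) for r, l in zip(rows, labels) if nb_predict(model, r) == l]
    return [r for r, _ in kept], [l for _, l in kept]

# Hypothetical sales records: (price band, channel) -> outcome.
rows = [("high", "web")] * 4 + [("low", "store")] * 4 + [("high", "web")]
labels = ["buy"] * 4 + ["skip"] * 4 + ["skip"]  # the last record is mislabeled
clean_rows, clean_labels = filter_noisy(rows, labels)
```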

Multimedia Watermark Detection Algorithm Based on Bayes Decision Theory

  • Kwon, Seong-Geun;Lee, Suk-Hwan;Kwon, Kee-Koo;Kwon, Ki-Ryong;Lee, Kuhn-Il
    • Proceedings of the IEEK Conference
    • /
    • 2002.07b
    • /
    • pp.1272-1275
    • /
    • 2002
  • Watermark detection plays a crucial role in multimedia copyright protection and has traditionally been tackled using correlation-based algorithms. However, correlation-based detection is not actually the best choice, as it does not utilize the distributional characteristics of the image being marked. Accordingly, an efficient watermark detection scheme for DWT coefficients is proposed as optimal for non-additive schemes. Based on statistical decision theory, the proposed method is derived according to Bayes' decision theory, the Neyman-Pearson criterion, and the distribution of the DWT coefficients, thereby minimizing the missed detection probability subject to a given false alarm probability. The proposed method was tested for robustness, and the results confirmed the superiority of the proposed technique over the conventional correlation-based detection method.
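
The Neyman-Pearson criterion mentioned above fixes the false alarm probability and then chooses the detection threshold accordingly. The sketch below illustrates this for a deliberately simplified case, which is an assumption and not the paper's model: the detection statistic is Gaussian with mean 0 when no watermark is present and mean `shift` when one is, so the likelihood ratio test reduces to a threshold on the statistic itself. The paper instead uses the distribution of the DWT coefficients.

```python
from statistics import NormalDist

def np_threshold(false_alarm_prob, sigma=1.0):
    """Threshold tau such that P(statistic > tau | no watermark) equals the target."""
    return NormalDist(0.0, sigma).inv_cdf(1.0 - false_alarm_prob)

def missed_detection_prob(threshold, shift, sigma=1.0):
    """P(statistic <= tau | watermark present) under the assumed Gaussian model."""
    return NormalDist(shift, sigma).cdf(threshold)

def detect(statistic, threshold):
    """Declare the watermark present when the statistic exceeds the threshold."""
    return statistic > threshold

threshold = np_threshold(false_alarm_prob=0.05)
miss = missed_detection_prob(threshold, shift=3.0)
```

Lowering the allowed false alarm probability raises the threshold and therefore raises the missed detection probability, which is the trade-off the criterion makes explicit.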

Sampling Based Approach to Bayesian Analysis of Binary Regression Model with Incomplete Data

  • Chung, Young-Shik
    • Journal of the Korean Statistical Society
    • /
    • v.26 no.4
    • /
    • pp.493-505
    • /
    • 1997
  • The analysis of binary data appears in many areas, such as statistics, biometrics, and econometrics. In many cases, data are collected in which some observations are incomplete. Assume that the missing covariates are missing at random and the responses are completely observed. A method for Bayesian analysis of the binary regression model with incomplete data is presented. In particular, the desired marginal posterior moments of the regression parameter are obtained using the Metropolis algorithm (Metropolis et al. 1953) within the Gibbs sampler (Gelfand and Smith, 1990). Also, we compare the logit model with the probit model using the Bayes factor, which is approximated by an importance sampling method. One example is presented.
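
As a rough illustration of the Metropolis step only, the sketch below draws from the posterior of a single logit slope with complete data. The toy data, the normal prior with standard deviation 10, the proposal step size, and the function names are all assumptions. The Gibbs step that imputes missing covariates and the Bayes factor comparison are not shown.

```python
import math
import random

# Hypothetical complete data: one covariate and a binary response.
X = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
Y = [0, 0, 0, 1, 0, 1, 1, 1]

def log_posterior(beta, prior_sd=10.0):
    """Logit log likelihood plus a normal log prior (up to a constant)."""
    log_lik = 0.0
    for x, y in zip(X, Y):
        eta = beta * x
        # Numerically stable log(1 + exp(eta)).
        softplus = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        log_lik += y * eta - softplus
    return log_lik - beta**2 / (2.0 * prior_sd**2)

def metropolis(n_iter=5000, step=1.0, seed=0):
    """Random-walk Metropolis sampler for the slope beta."""
    rng = random.Random(seed)
    beta, current = 0.0, log_posterior(0.0)
    draws, accepted = [], 0
    for _ in range(n_iter):
        proposal = beta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            beta, current = proposal, candidate
            accepted += 1
        draws.append(beta)
    return draws, accepted / n_iter

draws, acceptance_rate = metropolis()
kept = draws[1000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```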

Comparison of different estimators of P(Y

  • Hassan, Marwa KH.
    • International Journal of Reliability and Applications
    • /
    • v.18 no.2
    • /
    • pp.83-98
    • /
    • 2017
  • Stress-strength reliability problems arise frequently in applied statistics and related fields. In the context of reliability, the stress-strength model describes the life of a component that has a random strength X and is subjected to a random stress Y. The component fails at the instant the stress applied to it exceeds its strength, and it functions satisfactorily whenever X > Y. This paper considers the problem of estimating the reliability parameter R = P[Y < X] in a stress-strength model when X and Y are two independent two-parameter Lindley random variables. The maximum likelihood estimator (MLE) and the Bayes estimator of R are obtained. Different confidence intervals for R are also obtained. A simulation study is performed to compare the proposed estimation methods. An example with real data is used as a practical application of the proposed procedure.
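
To make the quantity R = P[Y < X] concrete, the sketch below computes a simple distribution-free estimate: the fraction of (strength, stress) pairs in which the stress is below the strength. This is a swapped-in illustration, not the MLE or Bayes estimator derived in the paper for the Lindley model, and the sample values are hypothetical.

```python
def estimate_reliability(strengths, stresses):
    """Fraction of all (x, y) pairs with y < x, an estimate of P[Y < X]."""
    favorable = sum(1 for x in strengths for y in stresses if y < x)
    return favorable / (len(strengths) * len(stresses))

# Hypothetical observed strengths (X) and stresses (Y).
strengths = [5.0, 7.0, 9.0]
stresses = [4.0, 6.0, 8.0]
r_hat = estimate_reliability(strengths, stresses)  # 6 of 9 pairs, about 0.667
```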

Improving Naïve Bayes Text Classifiers with Incremental Feature Weighting (점진적 특징 가중치 기법을 이용한 나이브 베이즈 문서분류기의 성능 개선)

  • Kim, Han-Joon;Chang, Jae-Young
    • The KIPS Transactions:PartB
    • /
    • v.15B no.5
    • /
    • pp.457-464
    • /
    • 2008
  • In real-world operational environments, most text classification systems suffer from insufficient training documents and no prior knowledge of the feature space. In this regard, Naïve Bayes is known to be an appropriate algorithm for operational text classification, since the classification model can evolve easily by incrementally updating its pre-learned classification model and feature space. This paper proposes a technique for improving the Naïve Bayes classifier through a feature weighting strategy. The basic idea is that parameter estimation in Naïve Bayes considers the degree of feature importance as well as the feature distribution. We can develop a more accurate classification model by incorporating feature weights into the Naïve Bayes learning algorithm, rather than performing the learning process with a reduced feature set. In addition, we have extended a conventional feature update algorithm for incremental feature weighting in a dynamic operational environment. To evaluate the proposed method, we perform experiments using various document collections and show that the traditional Naïve Bayes classifier can be significantly improved by the proposed technique.
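
The two ideas in this abstract, incremental updating and feature weighting, can be illustrated with a count-based Naïve Bayes model in which each word's log likelihood is multiplied by a weight. The abstract does not give the paper's weighting formula or update algorithm, so the weighting rule used here, the class `IncrementalWeightedNB`, and the toy documents are assumptions.

```python
import math
from collections import Counter, defaultdict

class IncrementalWeightedNB:
    def __init__(self):
        self.class_docs = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def update(self, tokens, label):
        """Fold one labeled document into the stored counts (no retraining)."""
        self.class_docs[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def classify(self, tokens, weights=None):
        """Pick the class maximizing log prior + sum of weight * log P(word | class)."""
        weights = weights or {}
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label in self.class_docs:
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(self.class_docs[label] / total_docs)
            for token in tokens:
                if token in self.vocab:
                    prob = (self.word_counts[label][token] + 1) / denom
                    # Words without an explicit weight default to 1.0.
                    score += weights.get(token, 1.0) * math.log(prob)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

clf = IncrementalWeightedNB()
clf.update(["match", "team", "win"], "sports")
clf.update(["team", "score", "goal"], "sports")
clf.update(["vote", "party", "win"], "politics")
clf.update(["party", "election", "team"], "politics")

unweighted = clf.classify(["team", "vote"])
weighted = clf.classify(["team", "vote"], weights={"team": 3.0})
```

On this toy data the unweighted model follows the word "vote", while giving "team" a larger weight flips the decision, which shows how feature importance can change a prediction without removing any feature.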

Extraction of ObjectProperty-UsageMethod Relation from Web Documents

  • Pechsiri, Chaveevan;Phainoun, Sumran;Piriyakul, Rapeepun
    • Journal of Information Processing Systems
    • /
    • v.13 no.5
    • /
    • pp.1103-1125
    • /
    • 2017
  • This paper aims to extract an ObjectProperty-UsageMethod relation, in particular the HerbalMedicinalProperty-UsageMethod relation of the herb-plant object, as a semantic relation between two related sets, a herbal-medicinal-property concept set and a usage-method concept set from several web documents. This HerbalMedicinalProperty-UsageMethod relation benefits people by providing an alternative treatment/solution knowledge to health problems. The research includes three main problems: how to determine EDU (where EDU is an elementary discourse unit or a simple sentence/clause) with a medicinal-property/usage-method concept; how to determine the usage-method boundary; and how to determine the HerbalMedicinalProperty-UsageMethod relation between the two related sets. We propose using N-Word-Co on the verb phrase with the medicinal-property/usage-method concept to solve the first and second problems where the N-Word-Co size is determined by the learning of maximum entropy, support vector machine, and naïve Bayes. We also apply naïve Bayes to solve the third problem of determining the HerbalMedicinalProperty-UsageMethod relation with N-Word-Co elements as features. The research results can provide high precision in the HerbalMedicinalProperty-UsageMethod relation extraction.

Bayesian Test of Quasi-Independence in a Sparse Two-Way Contingency Table

  • Kwak, Sang-Gyu;Kim, Dal-Ho
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.3
    • /
    • pp.495-500
    • /
    • 2012
  • We consider a Bayesian test of independence in a two-way contingency table that has some zero cells. To do this, we take a three-stage hierarchical Bayesian model under each hypothesis. For the prior, we use a Dirichlet density to model the marginal cell probabilities and each cell probability. Our method does not require complicated computation such as a Metropolis-Hastings algorithm to draw samples from each posterior density of the parameters. We draw samples using a Gibbs sampler with a grid method. For complicated posterior formulas, we apply Monte Carlo integration and the sampling importance resampling algorithm. We compare the values of the Bayes factor with the results of a chi-square test and the likelihood ratio test.
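
To show what a Bayes factor for independence looks like in the simplest setting, the sketch below uses the closed-form marginal likelihoods of a multinomial table under symmetric Dirichlet priors. This is a swapped-in simplification: it tests full independence with a single-stage prior, not the three-stage hierarchical quasi-independence model of the paper, and it needs no sampling. The function names and example tables are hypothetical.

```python
from math import lgamma

def log_beta(alphas):
    """Log of the multivariate beta function B(alpha_1, ..., alpha_k)."""
    return sum(lgamma(a) for a in alphas) - lgamma(sum(alphas))

def log_marginal(counts, alpha=1.0):
    """Log marginal likelihood of counts under a symmetric Dirichlet prior.

    The multinomial coefficient is omitted because it cancels in the ratio.
    """
    prior = [alpha] * len(counts)
    posterior = [alpha + n for n in counts]
    return log_beta(posterior) - log_beta(prior)

def log_bf_dependence(table, alpha=1.0):
    """Log Bayes factor of the saturated model against independence."""
    cells = [n for row in table for n in row]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    saturated = log_marginal(cells, alpha)
    independent = log_marginal(row_totals, alpha) + log_marginal(col_totals, alpha)
    return saturated - independent

strong_association = log_bf_dependence([[10, 0], [0, 10]])  # has zero cells
no_association = log_bf_dependence([[5, 5], [5, 5]])
```

A positive value favors dependence and a negative value favors independence. Zero cells cause no difficulty here because the Dirichlet prior keeps every term finite.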