• Title/Summary/Keyword: statistical methods


Bayesian and Classical Methods in Statistical Inference, with Reference to Reliability Analysis (통계적 추론에 있어서 베이지안과 고전적 방법(신뢰성 분석과 관련하여))

  • 박태룡
    • Journal for History of Mathematics, v.11 no.1, pp.68-77, 1998
  • Two approaches are widely used in statistical inference: sampling-theory methods and Bayesian methods. This paper introduces the most basic differences between the two approaches. In particular, it traces the historical origin of the Bayesian methods currently used in statistical inference, and it describes some characteristics of both sampling-theory and Bayesian methods.


Statistical Estimation of Modal Characteristics of a Structural System Based on Design Variable Samples (설계변수 표본에 근거한 구조시스템 모달 특성의 통계적 예측)

  • Kim, Yong-Woo;Yoo, Hong-Hee
    • Transactions of the Korean Society of Mechanical Engineers A, v.33 no.11, pp.1314-1319, 2009
  • Design methods for mechanical systems are broadly classified into deterministic and stochastic methods. In deterministic methods, design parameters are assumed to have fixed values, whereas in stochastic methods they are assumed to be statistically distributed. When a stochastic method is employed, the statistical characteristics of the populations of design variables are assumed to be known. Very often, however, obtaining these characteristics is nearly impossible or very expensive, so a sample survey method is usually employed instead. This paper describes a procedure for estimating the statistical characteristics of populations from sample data sets. An AFM micro-cantilever beam example is used to show the effectiveness of the procedure.

Fault Prediction Using Statistical and Machine Learning Methods for Improving Software Quality

  • Malhotra, Ruchika;Jain, Ankita
    • Journal of Information Processing Systems, v.8 no.2, pp.241-262, 2012
  • An understanding of quality attributes helps a software organization deliver highly reliable software. An empirical assessment of metrics that predict quality attributes is essential for gaining insight into software quality in the early phases of development and for ensuring corrective action. In this paper, we build models to estimate fault proneness using Object Oriented CK metrics and QMOOD metrics. We apply one statistical method and six machine learning methods to build the models. The proposed models are validated using a dataset collected from open-source software. The results are analyzed using the Area Under the Curve (AUC) obtained from Receiver Operating Characteristic (ROC) analysis. The results show that the models built using the random forest and bagging methods outperformed all the other models. Based on these results, it is reasonable to claim that Object Oriented metrics are significantly relevant to quality models and that machine learning methods perform comparably to statistical methods.
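
The AUC criterion mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard rank interpretation of AUC (the probability that a randomly chosen faulty class scores higher than a randomly chosen non-faulty one, with ties counted as one half). The function name `auc_score` and the toy labels and scores are assumptions for illustration only and are not taken from the paper.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 (1 = faulty, an illustrative convention)
    scores: predicted fault-proneness scores, higher = more likely faulty
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: four classes, two of them faulty.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_score(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why it is a convenient single number for comparing models such as those in the abstract.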

Bayesian Methods for Generalized Linear Models

  • Paul E. Green;Kim, Dae-Hak
    • Communications for Statistical Applications and Methods, v.6 no.2, pp.523-532, 1999
  • Generalized linear models have various applications for data arising from many kinds of statistical studies. Although the response variable is generally assumed to be generated from a wide class of probability distributions, we focus on count data, which are most often analyzed using binomial models for proportions or Poisson models for rates. The methods and results presented here also apply to many other categorical data models, owing to the relationship between multinomial and Poisson sampling. The novelty of the approach suggested here is that all conditional distributions can be specified directly, so that straightforward Gibbs sampling is possible. The prior distribution consists of two stages: a normal nonconjugate prior at the first stage and a vague prior for the hyperparameters at the second stage. The methods are demonstrated with an illustrative example using data collected by Rosenkranz and Raftery (1994) on the number of hospital admissions due to back pain in Washington State.


Application of data mining and statistical measurement of agricultural high-quality development

  • Yan Zhou
    • Advances in nano research, v.14 no.3, pp.225-234, 2023
  • In this study, we use big data resources and statistical analysis to derive reliable guidance for achieving high-quality, high-yield agricultural production. To this end, soil type, rainfall, and temperature data, together with annual wheat production, are collected for a specific region. The acquired data are cleaned using statistical methodology to remove incomplete and defective records. Several machine learning classification methods are then used to distinguish between different factors and their influence on the final crop yield. The efficacy of the machine learning methods is discussed by comparing the models' predictions using two statistical quantities, the correlation factor and the mean squared error between predicted and actual crop yields. The results show that the machine learning methods predict crop yields with high accuracy. Moreover, the random forest (RF) classification approach provides the best results among the classification methods used in this study.

A Review of the Statistical Analysis used in Clinical Articles Published on Journal of Korean Neurosurgical Society

  • Kang, Wee-Chang
    • Journal of Korean Neurosurgical Society, v.40 no.4, pp.304-308, 2006
  • Statistical analyses used in clinical articles published in the Journal of Korean Neurosurgical Society were identified, and the appropriateness of the statistical aspects of reporting results was assessed. Forty-seven clinical articles published in the journal from February 2005 to February 2006 were selected for this study. The frequency of statistical analysis was as follows: descriptive statistics only, 24 [51.1%]; one type of statistical method, 10 [21.3%]; two or more methods, 13 [27.6%]. An assessment of statistical aspects was performed on 24 clinical articles reporting inferential statistics. Ten articles [41.7%] did not adequately describe or reference all statistical methods used. Six articles [25.0%] did not report the confidence level used as the criterion for statistical significance. In thirteen articles [54.2%], it seems more appropriate to apply multivariate analyses in addition to univariate analyses. We recommend that the journal's readers concentrate on improving their knowledge of basic statistics and that statistical review of submitted manuscripts be sought from professionals in biostatistics and epidemiology.

Methods and Sample Size Effect Evaluation for Wafer Level Statistical Bin Limits Determination with Poisson Distributions (포아송 분포를 가정한 Wafer 수준 Statistical Bin Limits 결정방법과 표본크기 효과에 대한 평가)

  • Park, Sung-Min;Kim, Young-Sig
    • IE interfaces, v.17 no.1, pp.1-12, 2004
  • In the modern semiconductor device manufacturing industry, statistical bin limits on wafer-level test bin data are used to minimize the value added to defective product and to protect end customers from potential quality and reliability excursions. Most wafer-level test bin data show skewed distributions. Using Monte Carlo simulation, this paper evaluates methods for determining statistical bin limits and the effect of sample size. In the simulation, wafer-level test bin data are assumed to follow the Poisson distribution, so typical shapes of the data distribution can be specified in terms of the distribution's parameter. This study examines three methods: 1) percentile-based methodology; 2) data transformation; and 3) Poisson model fitting. The mean square error is adopted as the performance measure for each simulation scenario. A case study is then presented. Results show that the percentile-based and transformation-based methods give more stable statistical bin limits for the real dataset. With highly skewed distributions, however, the transformation-based method should be used with caution. When the data are well fitted by a certain probability distribution, the model-fitting approach can be used. As for the sample size effect, the mean square error appears to decrease exponentially as the sample size increases.
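
The kind of simulation this abstract describes can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the bin limit is taken to be an upper quantile (here the 0.99 quantile, an assumed choice), only the percentile-based and Poisson model-fitting methods are compared, and all function names, the seed, and the parameter values are hypothetical.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate (Knuth's method; suited to small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def poisson_quantile(lam, p):
    """Smallest integer k with P(X <= k) >= p for X ~ Poisson(lam)."""
    k = 0
    pmf = math.exp(-lam)
    cdf = pmf
    while cdf < p:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k


def empirical_quantile(data, p):
    """Smallest order statistic whose empirical CDF is at least p."""
    ordered = sorted(data)
    index = max(0, math.ceil(p * len(ordered)) - 1)
    return ordered[index]


def mse_of_limits(lam, n, p=0.99, reps=2000, seed=0):
    """Monte Carlo MSE of two bin-limit estimators against the true quantile.

    Returns (mse_percentile, mse_model_fit) for samples of size n.
    """
    rng = random.Random(seed)
    true_limit = poisson_quantile(lam, p)
    sq_err_pct = sq_err_fit = 0.0
    for _ in range(reps):
        sample = [poisson_sample(rng, lam) for _ in range(n)]
        pct_limit = empirical_quantile(sample, p)      # percentile-based
        lam_hat = sum(sample) / n                      # Poisson MLE of the mean
        fit_limit = poisson_quantile(lam_hat, p)       # model-fitting
        sq_err_pct += (pct_limit - true_limit) ** 2
        sq_err_fit += (fit_limit - true_limit) ** 2
    return sq_err_pct / reps, sq_err_fit / reps
```

Calling `mse_of_limits(lam=2.0, n=50)` returns the two mean square errors for one scenario; repeating the call over a grid of `n` values gives a picture of the sample size effect under these assumptions.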

Graphical Methods for Influence Diagnostics

  • Dae Heung Jang
    • Communications for Statistical Applications and Methods, v.4 no.2, pp.359-365, 1997
  • Unusual observations can greatly influence the results of least squares estimation. I propose graphical methods that can detect influential observations.
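
The abstract does not say which diagnostics the proposed plots are built on. As a generic illustration of how a single unusual observation can be flagged in a least squares fit, the sketch below computes Cook's distance for simple linear regression; this is a standard diagnostic swapped in for illustration, not the author's graphical method, and the function name and toy data are assumptions.

```python
def cooks_distance(x, y):
    """Cook's distance for each point of the fit y = a + b * x.

    Assumes at least three points, non-constant x, and a fit that is not
    exact (a zero residual variance would cause a division by zero).
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    n_params = 2  # intercept and slope
    resid_var = sum(e * e for e in residuals) / (n - n_params)

    # Leverage of each point (diagonal of the hat matrix).
    leverage = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]

    return [
        (e * e) / (n_params * resid_var) * h / (1.0 - h) ** 2
        for e, h in zip(residuals, leverage)
    ]


# Hypothetical toy data: the last point lies far from the trend of the others.
toy_x = [1, 2, 3, 4, 5, 10]
toy_y = [1.1, 1.9, 3.2, 3.9, 5.1, 2.0]
toy_d = cooks_distance(toy_x, toy_y)
most_influential = toy_d.index(max(toy_d))
```

Plotting the values in `toy_d` against the observation index is one simple graphical display in which an influential point stands out.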


A Proposal of Some Analysis Methods for Discovery of User Information from Web Data

  • Ahn, JeongYong;Han, Kyung Soo
    • Communications for Statistical Applications and Methods, v.8 no.1, pp.281-289, 2001
  • The continuous growth in the use of the World Wide Web is generating data of very large scale and of many different types. Analyzing such data can help to determine the lifetime value of users, evaluate the effectiveness of web sites, and design marketing strategies and services. In this paper, we propose some analysis methods for web data and present an example of a prototypical web data analysis.
