• Title/Summary/Keyword: misclassification

Search Results: 231

Misclassification Adjustment of Family History of Breast Cancer in a Case-Control Study: a Bayesian Approach

  • Moradzadeh, Rahmatollah; Mansournia, Mohammad Ali; Baghfalaki, Taban; Ghiasvand, Reza; Noori-Daloii, Mohammad Reza; Holakouie-Naieni, Kourosh
    • Asian Pacific Journal of Cancer Prevention / v.16 no.18 / pp.8221-8226 / 2016
  • Background: Misreporting of self-reported family history may lead to biased estimates. We used Bayesian methods to adjust for exposure misclassification. Materials and Methods: A hospital-based case-control study was used to identify breast cancer risk factors among Iranian women. Three models were considered jointly: an outcome model, an exposure model and a measurement model. All models were fitted using Bayesian methods and run until convergence. Results: Bayesian analysis of the model without misclassification showed that the odds ratios for the relationship between breast cancer and a family history under different prior distributions were 2.98 (95% CRI: 2.41, 3.71), 2.57 (95% CRI: 1.95, 3.41) and 2.53 (95% CRI: 1.93, 3.31). In the misclassification model, the odds ratios adjusted for misclassification in the different scenarios were 2.64 (95% CRI: 2.02, 3.47), 2.64 (95% CRI: 2.02, 3.46), 1.60 (95% CRI: 1.07, 2.38), 1.61 (95% CRI: 1.07, 2.40), 1.57 (95% CRI: 1.05, 2.35), 1.58 (95% CRI: 1.06, 2.34) and 1.57 (95% CRI: 1.06, 2.33). Conclusions: Self-reported family history may be misclassified under different scenarios. Given the lack of validation studies in Iran, more attention to this issue in future research is suggested, especially since the results depend on the sensitivity and specificity values used.
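
The abstract names three jointly fitted models but does not spell them out. A generic sketch of this structure for Bayesian adjustment of a misclassified binary exposure, using hypothetical symbols (Y case status, X true family history, X* self-reported history, Se and Sp the sensitivity and specificity, each given informative priors), is:

```latex
\begin{align*}
\text{Outcome model:}\quad & \operatorname{logit} P(Y_i = 1 \mid X_i) = \beta_0 + \beta_1 X_i \\
\text{Exposure model:}\quad & X_i \sim \operatorname{Bernoulli}(\pi) \\
\text{Measurement model:}\quad & P(X_i^{*} = 1 \mid X_i = 1) = Se, \qquad P(X_i^{*} = 0 \mid X_i = 0) = Sp
\end{align*}
```

Under a structure like this, exp(beta_1) is the family-history odds ratio adjusted for misclassification, which is why the adjusted odds ratios reported above vary with the assumed sensitivity and specificity scenarios.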

Design of Screening Inspection Procedures Based on Guard Bands Considering Measurement Errors (측정오류를 고려한 가드밴드 기반 스크리닝 검사방식의 설계)

  • Kim, Young Jin
    • Journal of Korean Society for Quality Management / v.41 no.4 / pp.673-681 / 2013
  • Purpose: The purpose of this study is to investigate design optimization modeling of screening procedures based on the assessment of misclassification errors. Methods: Misclassification errors due to measurement variability are derived for normally distributed quality characteristics. Further, an optimization model for ensuring the level of outgoing quality is proposed and demonstrated through an illustrative example. Results: It is shown that the two types of misclassification errors (i.e., false acceptance and false rejection) may be properly balanced through an analytical assessment of measurement errors and an optimization model. A variety of optimization models may also be formulated based on the derived measurement errors. Conclusion: The design of screening inspection may be further facilitated by including the effect of measurement errors on the performance of the screening inspection procedure.
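
The abstract does not reproduce the derivation; a minimal sketch of how the two misclassification probabilities can be computed for a normally distributed characteristic measured with additive Gaussian error, under guard-banded (tightened) acceptance limits, might look like the following (all numerical values are assumed for illustration):

```python
# Hedged sketch (not the paper's exact formulation): misclassification
# probabilities under guard-banded screening of a normal characteristic.
import numpy as np
from scipy import integrate
from scipy.stats import norm

mu, sigma_p = 10.0, 1.0      # process mean and standard deviation (assumed)
sigma_m = 0.3                # measurement error standard deviation (assumed)
LSL, USL = 8.0, 12.0         # specification limits (assumed)
delta = 0.2                  # guard band width (assumed)

def false_rejection():
    # P(item conforming, but measured value falls outside the tightened limits)
    f = lambda x: norm.pdf(x, mu, sigma_p) * (
        norm.cdf(LSL + delta, x, sigma_m) + norm.sf(USL - delta, x, sigma_m))
    return integrate.quad(f, LSL, USL)[0]

def false_acceptance():
    # P(item nonconforming, but measured value falls inside the tightened limits)
    f = lambda x: norm.pdf(x, mu, sigma_p) * (
        norm.cdf(USL - delta, x, sigma_m) - norm.cdf(LSL + delta, x, sigma_m))
    lo = integrate.quad(f, -np.inf, LSL)[0]
    hi = integrate.quad(f, USL, np.inf)[0]
    return lo + hi

print(false_rejection(), false_acceptance())
```

Widening the guard band generally trades false acceptance for false rejection, which is the kind of compromise the proposed optimization model resolves.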

MISCLASSIFICATION IN SIZE-BIASED MODIFIED POWER SERIES DISTRIBUTION AND ITS APPLICATIONS

  • Hassan, Anwar; Ahmad, Peer Bilal
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.13 no.1 / pp.55-72 / 2009
  • A misclassified size-biased modified power series distribution (MSBMPSD), in which some of the observations corresponding to x = c + 1 are misclassified as x = c with probability $\alpha$, is defined. We obtain recurrence relations among its raw moments, central moments and factorial moments. The effect of the misclassification on the variance is discussed. To illustrate the situation under consideration, some particular cases such as the size-biased generalized negative binomial (SBGNB), the size-biased generalized Poisson (SBGP) and size-biased Borel distributions are included. Finally, an example is presented for the size-biased generalized Poisson distribution to illustrate the results.
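
Written out from the definition in the abstract, if p(x) denotes the pmf of the underlying size-biased modified power series distribution, the observed (misclassified) distribution has the form:

```latex
p^{*}(x) =
\begin{cases}
p(c) + \alpha\, p(c+1), & x = c, \\
(1 - \alpha)\, p(c+1), & x = c + 1, \\
p(x), & x \neq c,\ c + 1.
\end{cases}
```

The moment recurrences studied in the paper are obtained for this modified pmf.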

Estimating Prediction Errors in Binary Classification Problem: Cross-Validation versus Bootstrap

  • Kim Ji-Hyun; Cha Eun-Song
    • Communications for Statistical Applications and Methods / v.13 no.1 / pp.151-165 / 2006
  • It is important to estimate the true misclassification rate of a given classifier when an independent set of test data is not available. Cross-validation and bootstrap are two possible approaches in this case. In the related literature, bootstrap estimators of the true misclassification rate were asserted to perform better for small samples than cross-validation estimators. We compare the two estimators empirically when the classification rule is so adaptive to the training data that its apparent misclassification rate is close to zero. We confirm that bootstrap estimators perform better for small samples because of their small variance, and we also find that their bias tends to be significant even for moderate to large samples, in which case cross-validation estimators perform better with less computation.
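
As a rough illustration of the comparison described above (not the paper's exact experiment), the following sketch contrasts a 10-fold cross-validation estimate with a simple .632 bootstrap estimate for a 1-nearest-neighbour rule, whose apparent error rate is essentially zero:

```python
# Hedged sketch: cross-validation versus .632 bootstrap estimates of the
# misclassification rate for a highly adaptive classifier (1-NN).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=100, n_features=10, random_state=0)
clf = KNeighborsClassifier(n_neighbors=1)   # apparent error rate is ~0 for 1-NN

# 10-fold cross-validation estimate of the misclassification rate
cv_error = 1.0 - cross_val_score(clf, X, y, cv=10).mean()

# .632 bootstrap: weighted combination of the apparent and out-of-bag errors
apparent = 1.0 - clf.fit(X, y).score(X, y)
oob_errors = []
for _ in range(200):
    idx = rng.integers(0, len(y), len(y))                # bootstrap resample
    oob = np.setdiff1d(np.arange(len(y)), idx)           # out-of-bag observations
    if len(oob) == 0:
        continue
    fitted = KNeighborsClassifier(n_neighbors=1).fit(X[idx], y[idx])
    oob_errors.append(1.0 - fitted.score(X[oob], y[oob]))
boot632 = 0.368 * apparent + 0.632 * np.mean(oob_errors)

print(f"CV error: {cv_error:.3f}, .632 bootstrap error: {boot632:.3f}")
```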

IMAGE SEGMENTATION BASED ON THE STATISTICAL VARIATIONAL FORMULATION USING THE LOCAL REGION INFORMATION

  • Park, Sung Ha; Lee, Chang-Ock; Hahn, Jooyoung
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.18 no.2 / pp.129-142 / 2014
  • We propose a variational segmentation model based on statistical information about the intensities in an image. The model consists of both a local region-based energy and a global region-based energy in order to handle the misclassification that occurs in a typical statistical variational model under the assumption that an image is a mixture of two Gaussian distributions. We find local ambiguous regions where misclassification may occur due to a small difference between the two Gaussian distributions. Based on statistical information restricted to these local ambiguous regions, we design a local region-based energy that reduces the misclassification. We also suggest an algorithm that avoids the difficulty of solving the Euler-Lagrange equations of the proposed variational model.
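
The abstract does not reproduce the energies; the global region-based term in such two-phase statistical models is typically of the following form (a generic sketch, with I the image, H the Heaviside function of a level-set function phi, g(.; mu_i, sigma_i) the two Gaussian densities, and nu a regularization weight), to which the paper adds a local region-based energy restricted to the ambiguous regions:

```latex
E_{\mathrm{global}}(\phi, \mu_1, \sigma_1, \mu_2, \sigma_2)
  = -\int_{\Omega} H(\phi)\, \log g\bigl(I(x); \mu_1, \sigma_1\bigr)\, dx
    -\int_{\Omega} \bigl(1 - H(\phi)\bigr)\, \log g\bigl(I(x); \mu_2, \sigma_2\bigr)\, dx
    +\nu \int_{\Omega} \lvert \nabla H(\phi) \rvert\, dx
```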

Evaluating Predictive Ability of Classification Models with Ordered Multiple Categories

  • Oong-Hyun Sung
    • Communications for Statistical Applications and Methods / v.6 no.2 / pp.383-395 / 1999
  • This study is concerned with evaluating the predictive ability of classification models with ordered multiple categories. If the categories can be ordered or ranked, the spread of misclassification should be taken into account when evaluating classification models, using a loss rate, since the apparent error rate cannot measure the spread of misclassification. Since the apparent loss rate is known to underestimate the true loss rate, the bootstrap method was used to estimate the true loss rate. This study therefore suggests a method for evaluating the predictive power of classification models using the loss rate and the bootstrap estimate of the true loss rate.
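
A minimal sketch of the distinction drawn above, using a hypothetical absolute-rank loss (the paper's actual loss matrix is not specified here):

```python
# Hedged sketch: for ordered categories the loss rate weights each error by
# how far apart the true and predicted ranks are, whereas the apparent error
# rate only counts whether an error occurred.
import numpy as np

true = np.array([0, 0, 1, 1, 2, 2, 2, 1])   # hypothetical ordered labels
pred = np.array([0, 1, 1, 2, 2, 0, 2, 1])   # hypothetical predictions

error_rate = np.mean(true != pred)          # ignores the spread of misclassification
loss_rate = np.mean(np.abs(true - pred))    # assumed absolute-rank loss per observation

print(error_rate, loss_rate)
```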

Analyzing Customer Management Data by Data Mining: Case Study on Churn Prediction Models for Insurance Company in Korea

  • Cho, Mee-Hye; Park, Eun-Sik
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1007-1018 / 2008
  • The purpose of this case study is to demonstrate database-marketing management. First, we explore the original variables in the insurance customer data, modify them if necessary, and go through a variable selection process before analysis. Then, we develop churn prediction models using logistic regression, neural network and SVM analyses. We also compare these three data mining models in terms of misclassification rate.
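
A minimal sketch of the model comparison described above, on simulated data rather than the insurance customer data used in the study:

```python
# Hedged sketch: comparing logistic regression, a neural network and an SVM
# by misclassification rate on a held-out test set (hypothetical data).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1],
                           random_state=1)   # churners as the rare class
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "neural network": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500),
    "SVM": SVC(),
}
for name, model in models.items():
    misclassification = 1.0 - model.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: misclassification rate = {misclassification:.3f}")
```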

Input Noise Immunity of Multilayer Perceptrons

  • Lee, Young-Jik; Oh, Sang-Hoon
    • ETRI Journal / v.16 no.1 / pp.35-43 / 1994
  • In this paper, the robustness of artificial neural networks to noise is demonstrated with a multilayer perceptron, and this robustness is attributed to the statistical orthogonality among hidden nodes and the network's hierarchical information extraction capability. In addition, the misclassification probability of a well-trained multilayer perceptron is derived without any linear approximations when the inputs are contaminated with random noise. The misclassification probability for a noisy pattern is shown to be a function of the input pattern, the noise variances, the weight matrices, and the nonlinear transformations. The result is verified on a handwritten digit recognition problem, where it gives better results than those obtained using linear approximations.
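
The paper's contribution is an analytic, approximation-free expression for this probability; as a purely empirical illustration of the quantity being derived, one can estimate the misclassification probability of a trained perceptron under increasing input noise by Monte Carlo:

```python
# Hedged sketch (empirical illustration only, not the paper's analytic
# derivation): misclassification probability of a trained MLP when the
# inputs are contaminated with Gaussian noise.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X = X / 16.0                                 # scale pixel values to [0, 1]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                    random_state=0).fit(X_tr, y_tr)

rng = np.random.default_rng(0)
for noise_std in (0.0, 0.1, 0.3):
    noisy = X_te + rng.normal(0.0, noise_std, X_te.shape)
    misclassification = 1.0 - mlp.score(noisy, y_te)
    print(f"noise std {noise_std}: estimated misclassification probability {misclassification:.3f}")
```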

Classification Analysis for Unbalanced Data (불균형 자료에 대한 분류분석)

  • Kim, Dongah; Kang, Suyeon; Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.28 no.3 / pp.495-509 / 2015
  • We study the classification problem in which the proportions of the two groups differ significantly, known as the unbalanced classification problem. It is usually more difficult to classify the classes accurately in unbalanced data than in balanced data. When classification methods are applied to unbalanced data, most observations are likely to be assigned to the larger group, because this minimizes the overall misclassification loss. However, misclassifying the smaller group as the larger group typically incurs the greater loss in most real applications. We compare several classification methods for unbalanced data using sampling techniques (up- and down-sampling). We also examine the total loss of the different classification methods when an asymmetric loss is applied to simulated and real data. We use the misclassification rate, G-mean, ROC and AUC (area under the curve) for the performance comparison.
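
A minimal sketch of the down-sampling approach and the G-mean metric mentioned above, on simulated data (not the paper's simulation design):

```python
# Hedged sketch: down-sampling the majority class, then scoring with the
# misclassification rate and the G-mean (sqrt of sensitivity * specificity).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=2)

# Down-sample the majority class in the training set to the minority size
rng = np.random.default_rng(2)
minority, majority = np.where(y_tr == 1)[0], np.where(y_tr == 0)[0]
keep = np.concatenate([minority, rng.choice(majority, len(minority), replace=False)])
clf = LogisticRegression(max_iter=1000).fit(X_tr[keep], y_tr[keep])

pred = clf.predict(X_te)
tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
misclassification_rate = (fp + fn) / len(y_te)
g_mean = np.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))
print(misclassification_rate, g_mean)
```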

Rapid Misclassification Sample Generation Attack on Deep Neural Network (딥뉴럴네트워크 상에 신속한 오인식 샘플 생성 공격)

  • Kwon, Hyun; Park, Sangjun; Kim, Yongchul
    • Convergence Security Journal / v.20 no.2 / pp.111-121 / 2020
  • Deep neural networks (DNNs) provide good performance for machine learning tasks such as image recognition and object recognition. However, DNNs are vulnerable to adversarial examples. An adversarial example is an attack sample that causes a neural network to recognize it incorrectly by adding minimal noise to the original sample. A drawback, however, is that generating such an adversarial example takes a long time, so in some cases an attack is needed that causes the neural network to misclassify quickly. In this paper, we propose a fast misclassification sample generation method that can rapidly attack neural networks. The proposed method does not consider the distortion of the original sample when adding noise. We used MNIST and CIFAR10 as experimental data and TensorFlow as the machine learning library. Experimental results show that the fast misclassification samples generated by the proposed method require 50% and 80% fewer iterations for MNIST and CIFAR10, respectively, compared to the conventional Carlini method, while achieving a 100% attack success rate.
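
The paper's generation method is not reproduced here; as a generic illustration of the underlying idea (adding a perturbation that is not constrained by distortion so that a classifier's decision flips), a single gradient-sign step against a simple hand-rolled logistic model looks like this:

```python
# Hedged sketch (a generic single-step gradient-sign attack, not the method
# proposed in the paper): perturbing an input so that a simple differentiable
# classifier misclassifies it, without constraining the distortion.
import numpy as np
from sklearn.datasets import load_digits

# Two-class subset of the digits data, with a hand-rolled logistic model so
# the input gradient is available in closed form.
X, y = load_digits(return_X_y=True)
mask = y < 2
X, y = X[mask] / 16.0, y[mask]

w = np.zeros(X.shape[1])
b = 0.0
for _ in range(200):                        # plain gradient-descent training
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.1 * X.T @ (p - y) / len(y)
    b -= 0.1 * np.mean(p - y)

x0, y0 = X[0], y[0]                         # a correctly classified sample
grad = (1 if y0 == 0 else -1) * w           # direction that pushes the score the wrong way
x_adv = np.clip(x0 + 0.5 * np.sign(grad), 0.0, 1.0)   # one large, unconstrained step

print("original prediction:", int((x0 @ w + b) > 0))
print("adversarial prediction:", int((x_adv @ w + b) > 0))
```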