• Title/Abstract/Keywords: Misclassification

226 search results (processing time: 0.025 s)

Misclassification Adjustment of Family History of Breast Cancer in a Case-Control Study: a Bayesian Approach

  • Moradzadeh, Rahmatollah;Mansournia, Mohammad Ali;Baghfalaki, Taban;Ghiasvand, Reza;Noori-Daloii, Mohammad Reza;Holakouie-Naieni, Kourosh
    • Asian Pacific Journal of Cancer Prevention / Vol. 16, No. 18 / pp. 8221-8226 / 2016
  • Background: Misreporting of self-reported family history may lead to biased estimates. We used Bayesian methods to adjust for exposure misclassification. Materials and Methods: A hospital-based case-control study was used to identify breast cancer risk factors among Iranian women. Three models were jointly considered: an outcome model, an exposure model, and a measurement model. All models were fitted using Bayesian methods and run until convergence. Results: Bayesian analysis in the model without misclassification showed that the odds ratios for the relationship between breast cancer and a family history under different prior distributions were 2.98 (95% CRI: 2.41, 3.71), 2.57 (95% CRI: 1.95, 3.41) and 2.53 (95% CRI: 1.93, 3.31). In the misclassified model, the odds ratios adjusted for misclassification in the different situations were 2.64 (95% CRI: 2.02, 3.47), 2.64 (95% CRI: 2.02, 3.46), 1.60 (95% CRI: 1.07, 2.38), 1.61 (95% CRI: 1.07, 2.40), 1.57 (95% CRI: 1.05, 2.35), 1.58 (95% CRI: 1.06, 2.34) and 1.57 (95% CRI: 1.06, 2.33). Conclusions: Self-reported family history may be misclassified in different scenarios. Given the lack of validation studies in Iran, more attention to this matter in future research is suggested, especially when results are obtained under assumed sensitivity and specificity values.
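
As a rough illustration of how assumed sensitivity and specificity values feed into an adjusted odds ratio, the sketch below applies a simple deterministic back-calculation to a 2x2 table. It is not the Bayesian three-model approach described in the abstract, and all counts and the names `corrected_count` and `adjusted_odds_ratio` are hypothetical.

```python
def corrected_count(observed_exposed, total, sensitivity, specificity):
    """Back-calculate the true number exposed from the observed number.

    Uses observed = Se * true + (1 - Sp) * (total - true), solved for true.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (observed_exposed - (1.0 - specificity) * total) / denom


def odds_ratio(a, b, c, d):
    """Odds ratio for exposed/unexposed cases (a, b) and controls (c, d)."""
    return (a * d) / (b * c)


def adjusted_odds_ratio(case_exp, case_total, ctrl_exp, ctrl_total, se, sp):
    """Odds ratio after correcting both groups with the same Se and Sp
    (an assumption of nondifferential misclassification)."""
    a = corrected_count(case_exp, case_total, se, sp)
    c = corrected_count(ctrl_exp, ctrl_total, se, sp)
    b = case_total - a
    d = ctrl_total - c
    if min(a, b, c, d) <= 0:
        raise ValueError("corrected cell counts must be positive")
    return odds_ratio(a, b, c, d)


if __name__ == "__main__":
    # Hypothetical table: 60 of 200 cases and 30 of 200 controls report exposure.
    print("crude OR:   ", odds_ratio(60, 140, 30, 170))
    print("adjusted OR:", adjusted_odds_ratio(60, 200, 30, 200, se=0.8, sp=0.95))
```

Using the same sensitivity and specificity for cases and controls is a simplifying assumption; separate values per group would be needed to represent differential misclassification.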

Design of Screening Inspection Procedures Based on Guard Bands Considering Measurement Errors

  • 김영진
    • 품질경영학회지 / Vol. 41, No. 4 / pp. 673-681 / 2013
  • Purpose: This study investigates design optimization modeling of screening procedures based on an assessment of misclassification errors. Methods: Misclassification errors due to measurement variability are derived for normally distributed quality characteristics. An optimization model for ensuring the level of outgoing quality is then proposed and demonstrated through an illustrative example. Results: The two types of misclassification errors (i.e., false acceptance and false rejection) can be properly balanced through an analytical assessment of measurement errors and optimization modeling. A variety of optimization models can also be formulated based on the derivation of measurement errors. Conclusion: The design of screening inspection can be further facilitated by including the effect of measurement errors on the performance of the screening inspection procedure.
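
The trade-off between false acceptance and false rejection can be sketched numerically. The example below assumes a normally distributed quality characteristic, an additive normal measurement error, and a symmetric guard band that tightens both specification limits by the same amount; these modeling choices, the parameter values, and the function name `misclassification_probs` are illustrative assumptions, not the paper's formulation.

```python
from statistics import NormalDist


def misclassification_probs(mu, sigma, sigma_e, lower, upper, guard, steps=2000):
    """Return (P(false acceptance), P(false rejection)).

    True value X ~ N(mu, sigma^2); measured value Y = X + E, E ~ N(0, sigma_e^2).
    An item is accepted when lower + guard <= Y <= upper - guard.
    """
    if sigma_e <= 0:
        raise ValueError("sigma_e must be positive")
    if guard < 0 or 2 * guard >= upper - lower:
        raise ValueError("guard band must leave a non-empty acceptance region")
    item = NormalDist(mu, sigma)
    lo_acc, hi_acc = lower + guard, upper - guard
    x_min, x_max = mu - 8 * sigma, mu + 8 * sigma
    dx = (x_max - x_min) / steps
    false_accept = false_reject = 0.0
    # Midpoint-rule integration over the true value x.
    for i in range(steps):
        x = x_min + (i + 0.5) * dx
        measured = NormalDist(x, sigma_e)
        p_accept = measured.cdf(hi_acc) - measured.cdf(lo_acc)
        weight = item.pdf(x) * dx
        if lower <= x <= upper:
            false_reject += (1.0 - p_accept) * weight  # conforming but rejected
        else:
            false_accept += p_accept * weight  # nonconforming but accepted
    return false_accept, false_reject


if __name__ == "__main__":
    for g in (0.0, 0.1, 0.3):
        fa, fr = misclassification_probs(0.0, 1.0, 0.2, -2.0, 2.0, g)
        print(f"guard={g:.1f}  false accept={fa:.5f}  false reject={fr:.5f}")
```

Widening the guard band lowers the false-acceptance probability at the cost of more false rejections, which is the kind of compromise an optimization model would have to resolve.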

MISCLASSIFICATION IN SIZE-BIASED MODIFIED POWER SERIES DISTRIBUTION AND ITS APPLICATIONS

  • Hassan, Anwar;Ahmad, Peer Bilal
    • Journal of the Korean Society for Industrial and Applied Mathematics / Vol. 13, No. 1 / pp. 55-72 / 2009
  • A misclassified size-biased modified power series distribution (MSBMPSD), in which some of the observations corresponding to x = c + 1 are misclassified as x = c with probability $\alpha$, is defined. We obtain recurrence relations among its raw moments, central moments and factorial moments. The effect of the misclassification on the variance is discussed. To illustrate the situation under consideration, some particular cases such as the size-biased generalized negative binomial (SBGNB), the size-biased generalized Poisson (SBGP) and the size-biased Borel distributions are included. Finally, an example is presented for the size-biased generalized Poisson distribution to illustrate the results.

Estimating Prediction Errors in Binary Classification Problem: Cross-Validation versus Bootstrap

  • Kim Ji-Hyun;Cha Eun-Song
    • Communications for Statistical Applications and Methods / Vol. 13, No. 1 / pp. 151-165 / 2006
  • It is important to estimate the true misclassification rate of a given classifier when an independent set of test data is not available. Cross-validation and the bootstrap are two possible approaches in this case. In the related literature, bootstrap estimators of the true misclassification rate have been asserted to perform better than cross-validation estimators for small samples. We compare the two estimators empirically when the classification rule is so adaptive to the training data that its apparent misclassification rate is close to zero. We confirm that bootstrap estimators perform better for small samples because of their small variance, and we also find that their bias tends to be significant even for moderate to large samples, in which case cross-validation estimators perform better with less computation.
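
A minimal sketch of the two kinds of estimator, assuming a one-dimensional 1-nearest-neighbour rule (whose apparent misclassification rate is zero on data without ties) and synthetic two-class Gaussian data. The out-of-bag average and the .632 combination used here are common bootstrap variants chosen for illustration; the abstract does not say which bootstrap estimators the authors compared, and every name below is hypothetical.

```python
import random


def nearest_neighbor_predict(train, x):
    """Return the label of the training point closest to x (1-NN rule)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def error_rate(train, test):
    """Fraction of test points misclassified by 1-NN fitted on train."""
    wrong = sum(nearest_neighbor_predict(train, x) != y for x, y in test)
    return wrong / len(test)


def loo_cv_error(data):
    """Leave-one-out cross-validation estimate of the misclassification rate."""
    wrong = sum(
        nearest_neighbor_predict(data[:i] + data[i + 1:], x) != y
        for i, (x, y) in enumerate(data)
    )
    return wrong / len(data)


def bootstrap_errors(data, n_boot=200, rng=None):
    """Return (out-of-bag estimate, .632 estimate) of the misclassification rate."""
    rng = rng or random.Random(0)
    n = len(data)
    oob_rates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        chosen = set(idx)
        held_out = [data[i] for i in range(n) if i not in chosen]
        if not held_out:
            continue
        oob_rates.append(error_rate([data[i] for i in idx], held_out))
    oob = sum(oob_rates) / len(oob_rates)
    apparent = error_rate(data, data)
    return oob, 0.368 * apparent + 0.632 * oob


def make_data(n, rng):
    """Synthetic data: class y in {0, 1}, feature drawn from N(1.5 * y, 1)."""
    return [(rng.gauss(1.5 * y, 1.0), y) for y in (rng.randrange(2) for _ in range(n))]


if __name__ == "__main__":
    sample = make_data(30, random.Random(1))
    oob, e632 = bootstrap_errors(sample)
    print("LOO CV:", loo_cv_error(sample), " OOB:", oob, " .632:", e632)
```

Because the apparent error of 1-NN is zero here, the .632 estimate reduces to 0.632 times the out-of-bag rate, which shows how a highly adaptive rule can pull such an estimator downward.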

IMAGE SEGMENTATION BASED ON THE STATISTICAL VARIATIONAL FORMULATION USING THE LOCAL REGION INFORMATION

  • Park, Sung Ha;Lee, Chang-Ock;Hahn, Jooyoung
    • Journal of the Korean Society for Industrial and Applied Mathematics / Vol. 18, No. 2 / pp. 129-142 / 2014
  • We propose a variational segmentation model based on statistical information about the intensities in an image. The model consists of both a local region-based energy and a global region-based energy in order to handle the misclassification that occurs in a typical statistical variational model, which assumes that an image is a mixture of two Gaussian distributions. We find local ambiguous regions where misclassification might happen due to a small difference between the two Gaussian distributions. Based on statistical information restricted to the local ambiguous regions, we design a local region-based energy in order to reduce the misclassification. We suggest an algorithm to avoid the difficulty of the Euler-Lagrange equations of the proposed variational model.

Evaluating Predictive Ability of Classification Models with Ordered Multiple Categories

  • Oong-Hyun Sung
    • Communications for Statistical Applications and Methods / Vol. 6, No. 2 / pp. 383-395 / 1999
  • This study is concerned with evaluating the predictive ability of classification models with ordered multiple categories. If the categories can be ordered or ranked, the spread of misclassification should be considered when evaluating the performance of classification models, using the loss rate, since the apparent error rate cannot measure the spread of misclassification. Since the loss rate is known to underestimate the true loss rate, the bootstrap method was used to estimate the true loss rate. Thus, this study suggests a method for evaluating the predictive power of classification models using the loss rate and the bootstrap estimate of the true loss rate.

Analyzing Customer Management Data by Data Mining: Case Study on Churn Prediction Models for Insurance Company in Korea

  • Cho, Mee-Hye;Park, Eun-Sik
    • Journal of the Korean Data and Information Science Society / Vol. 19, No. 4 / pp. 1007-1018 / 2008
  • The purpose of this case study is to demonstrate database-marketing management. First, we explore the original variables in insurance customer data, modify them if necessary, and go through a variable selection process before analysis. Then, we develop churn prediction models using logistic regression, neural network and SVM analysis. We also compare these three data mining models in terms of misclassification rate.

Input Noise Immunity of Multilayer Perceptrons

  • Lee, Young-Jik;Oh, Sang-Hoon
    • ETRI Journal / Vol. 16, No. 1 / pp. 35-43 / 1994
  • In this paper, the robustness of artificial neural networks to noise is demonstrated with a multilayer perceptron, and this robustness is attributed to the statistical orthogonality among hidden nodes and the network's hierarchical information extraction capability. Also, the misclassification probability of a well-trained multilayer perceptron is derived without any linear approximations when the inputs are contaminated with random noise. The misclassification probability for a noisy pattern is shown to be a function of the input pattern, the noise variances, the weight matrices, and the nonlinear transformations. The result is verified on a handwritten digit recognition problem, where it gives better results than those obtained using linear approximations.

Classification Analysis for Unbalanced Data

  • 김동아;강수연;송종우
    • 응용통계연구 / Vol. 28, No. 3 / pp. 495-509 / 2015
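  • In typical two-class classification problems, the proportions of the two classes often do not differ greatly. In this paper, we address the classification of unbalanced data, in which the proportions of the two classes differ greatly. Unbalanced data are often harder to classify than balanced data. When an ordinary classification model is applied to such data, most observations tend to be assigned to the larger class, yet in practical applications this kind of misclassification is usually the more costly one. We compared the performance of various classification methods using sampling techniques. We also compared which method yields the smallest loss when an asymmetric loss is assumed. Performance was compared using the misclassification rate, G-mean, ROC, and AUC (area under the curve).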
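
To illustrate why the misclassification rate alone can mislead on unbalanced data, the sketch below computes it alongside the G-mean (the geometric mean of sensitivity and specificity) for a classifier that assigns every observation to the larger class. The labels and the 5%/95% class split are hypothetical, and the function names are chosen here for illustration.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn), treating `positive` as the minority class label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def misclassification_rate(tp, fp, tn, fn):
    """Share of all observations assigned to the wrong class."""
    return (fp + fn) / (tp + fp + tn + fn)


def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    # Hypothetical data: 5 minority observations, 95 majority observations,
    # and a classifier that always predicts the majority class.
    y_true = [1] * 5 + [0] * 95
    y_pred = [0] * 100
    counts = confusion_counts(y_true, y_pred)
    print("misclassification rate:", misclassification_rate(*counts))
    print("G-mean:", g_mean(*counts))
```

The always-majority classifier looks accurate by misclassification rate but has a G-mean of zero, because it never identifies a minority observation.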

Rapid Misclassification Sample Generation Attack on Deep Neural Network

  • 권현;박상준;김용철
    • 융합보안논문지 / Vol. 20, No. 2 / pp. 111-121 / 2020
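  • Deep neural networks perform well in machine learning tasks such as image recognition and object recognition. However, deep neural networks are vulnerable to adversarial examples. An adversarial example is a sample created by adding minimal noise to an original sample so that the deep neural network misclassifies it. Generating such an example takes a long time, however, because the noise relative to the original sample must be kept minimal while still causing the deep neural network to misclassify. In some cases, therefore, an attack may be needed that quickly causes the deep neural network to misclassify, even if the noise is not minimal. In this paper, we propose a rapid misclassification sample generation attack that prioritizes attacking the deep neural network quickly. The proposed method adds noise focused on causing the deep neural network to misclassify, without considering distortion of the original sample. Because it does not separately account for distortion of the original sample, unlike existing methods, it generates samples faster. MNIST and CIFAR10 were used as experimental data, and Tensorflow was used as the machine learning library. In the experiments, the proposed misclassification samples achieved a 100% attack success rate while reducing the number of iterations by 50% on MNIST and 80% on CIFAR10 compared with the existing method.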