• Title/Abstract/Keywords: Statistical classification

Search results: 1,432 items (processing time: 0.033 s)

A Study of Pathogenesis Classification Using the Decision Tree Method

  • 이혁재;김민용;오환섭;박영배
    • 대한한의진단학회지 / Vol. 12, No. 2 / pp. 27-40 / 2008
  • Background: Despite the predominance of the theory of pathogenesis, pathogenesis classification depends on the doctor's clinical trials because of the lack of objective test criteria. Methods and Results: This study attempts to improve the objectivity of classification using a statistical method new to this problem, the decision tree. The decision tree method, a classification technique in statistical analysis, was used instead of discriminant analysis to analyze the results of a pathogenesis questionnaire. As a result, 10 of the 38 questionnaire items were selected as important questions, and 12 terminal nodes were built to classify the pathogenesis. Conclusions: Using only the 10 questions identified by the decision tree, the pathogenesis can be classified and interpreted easily and effectively.
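
As a rough illustration of how a decision tree can single out the most informative questionnaire items, the sketch below scores each yes/no question by its reduction in Gini impurity and returns the best one. The toy answers, labels, and function names are hypothetical and are not taken from the paper; the abstract does not state which tree-growing algorithm or split criterion the study used.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_question(answers, labels):
    """Return (index, gain) of the yes/no question with the largest impurity reduction."""
    parent = gini(labels)
    n = len(labels)
    best_idx, best_gain = None, 0.0
    for q in range(len(answers[0])):
        yes = [y for row, y in zip(answers, labels) if row[q] == 1]
        no = [y for row, y in zip(answers, labels) if row[q] == 0]
        # Impurity after the split, weighted by the size of each branch.
        weighted = (len(yes) * gini(yes) + len(no) * gini(no)) / n
        gain = parent - weighted
        if gain > best_gain:
            best_idx, best_gain = q, gain
    return best_idx, best_gain

# Hypothetical data: 4 respondents, 3 yes/no questions, 2 classes.
answers = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
labels = ["A", "A", "B", "B"]
idx, gain = best_question(answers, labels)
```

Applying the same selection recursively to each branch grows the tree; questions that never win a split are the ones a tree of this kind would leave out.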

Pillar and Vehicle Classification using Ultrasonic Sensors and Statistical Regression Method

  • 이충수;박은수;이종환;김종희;김학일
    • 제어로봇시스템학회논문지 / Vol. 20, No. 4 / pp. 428-436 / 2014
  • This paper proposes a statistical regression method for classifying pillars and vehicles in a parking area using a single ultrasonic sensor. The ultrasonic sensor provides three types of information: the time of flight (TOF), the peak, and the width of a pulse, from which 67 different features are extracted through segmentation and data preprocessing. Multiple SVM and multinomial logistic regression classifiers are applied to the set of extracted features and achieve accuracies of 85% and 89.67%, respectively, on a set of real-world data. The experimental results show that the proposed feature extraction and classification scheme is applicable to object classification using an ultrasonic sensor.
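
To make the multinomial logistic regression step more concrete, the sketch below shows how such a model turns a feature vector into class probabilities with a softmax. The two toy features, the weights, and the class names (including "other") are hypothetical placeholders, not values from the paper, which extracts 67 features and does not report its coefficients in the abstract.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(features, weights, biases, classes):
    """Return (predicted class, probabilities) for one feature vector."""
    scores = [sum(w * x for w, x in zip(ws, features)) + b
              for ws, b in zip(weights, biases)]
    probs = softmax(scores)
    best = max(range(len(classes)), key=probs.__getitem__)
    return classes[best], probs

# Hypothetical model with two features and three classes.
classes = ["pillar", "vehicle", "other"]
weights = [[2.0, -1.0], [-1.0, 2.0], [0.0, 0.0]]
biases = [0.0, 0.0, 0.0]
label, probs = predict([0.8, 0.2], weights, biases, classes)
```

In practice the weights would be fitted to labelled training data rather than set by hand.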

A Yields Prediction in the Semiconductor Manufacturing Process Using Stepwise Support Vector Machine

  • 안대웅;고효헌;김지현;백준걸;김성식
    • 산업공학 / Vol. 22, No. 3 / pp. 252-262 / 2009
  • Preventing low yields is crucial in the semiconductor industry. Because many interrelated factors affect yield variation, preventing low yield is difficult. There has been substantial research on yield prediction, much of it using statistical methods. Many studies have shown that artificial neural networks (ANNs) achieve better performance than traditional statistical methods. However, despite their superior performance, ANNs suffer from problems such as over-fitting and poor explanatory power. To overcome these limitations, a relatively new machine learning technique, the support vector machine (SVM), is introduced to classify the yield. SVM is simple enough to be analyzed mathematically, and it achieves high performance in practical applications. This study presents a new, efficient classification methodology, Stepwise-SVM (SSVM), for detecting high and low yields. SSVM adjusts parameters step by step so that actual high- and low-yield lots are classified precisely. The objective of this paper is to examine the feasibility of SVM and SSVM for yield classification. The experimental results show that SVM and SSVM provide a promising alternative for yield classification on field data.

Fault Location and Classification of Combined Transmission System: Economical and Accurate Statistic Programming Framework

  • Tavalaei, Jalal;Habibuddin, Mohd Hafiz;Khairuddin, Azhar;Mohd Zin, Abdullah Asuhaimi
    • Journal of Electrical Engineering and Technology
    • /
    • Vol. 12, No. 6
    • /
    • pp. 2106-2117
    • /
    • 2017
  • This paper presents an effective statistical feature extraction approach based on sampling fault data in a combined transmission system. The proposed algorithm predicts fault location and classifies fault type with high accuracy at minimum cost. It requires impedance measurement data from one end of the transmission line. Modal decomposition is used to extract the positive-sequence impedance. The fault signal is then decomposed using the discrete wavelet transform. Statistical sampling is used to extract appropriate fault features from the decomposed signal as benchmarks to train the classifier. A Support Vector Machine (SVM) is used to illustrate the performance of the statistical sampling. The overall sampling time does not exceed 1 1/4 cycles, taking the interval time into account. The proposed method samples in two steps: the first takes 3/4 cycle of during-fault impedance and the second takes 1/4 cycle of post-fault impedance. The interval between the two steps is assumed to be 1/4 cycle. Extensive studies using MATLAB show accurate fault location estimation and fault type classification by the proposed method. The classifier results are presented and compared with well-established travelling wave methods, and the performance of the algorithms is analyzed and discussed.
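
The timing budget stated in the abstract can be checked with a few lines of arithmetic. The sketch below adds the two sampling steps and the interval between them, then converts the total to milliseconds. The 50 Hz system frequency is an assumption made only for the conversion; the abstract does not state the frequency.

```python
from fractions import Fraction

# Durations from the abstract, in power-system cycles.
during_fault = Fraction(3, 4)  # first sampling step
interval = Fraction(1, 4)      # gap between the two steps
post_fault = Fraction(1, 4)    # second sampling step

total_cycles = during_fault + interval + post_fault

def cycles_to_ms(cycles, system_hz):
    """Convert a duration in cycles to milliseconds at the given frequency."""
    return float(cycles) * 1000.0 / system_hz

# Assumed 50 Hz system, used here only as an example.
total_ms = cycles_to_ms(total_cycles, 50)
```

The total comes to 5/4 cycles, matching the stated bound of 1 1/4 cycles.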

Malware Classification Using Statistical Techniques

  • 원성민;김현주;송종우
    • 응용통계연구 / Vol. 30, No. 6 / pp. 851-865 / 2017
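  • As the ransomware known as WannaCry recently became a major topic worldwide, methods for reducing the damage caused by malware have been receiving renewed attention. To minimize damage when new malware appears, it is necessary to quickly classify which type of attack the malware carries out. The purpose of this study is to build a model that can effectively classify malware using various statistical techniques. Multinomial logistic regression, random forest, gradient boosting, and support vector machine were used for model fitting, and the study found that certain variables play an important role in classifying malware.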

Detection of Mammographic Microcalcifications by Statistical Pattern Classification and Pattern Matching

  • 양윤석;김덕원;김은경
    • 대한의용생체공학회:의공학회지 / Vol. 18, No. 4 / pp. 357-364 / 1997
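  • Early detection of breast cancer is known to be the most important factor in reducing the mortality of cancer patients. In DCIS (ductal carcinoma in situ), which accounts for about 20% of the breast cancers found by screening, microcalcifications are the only finding visible on film. Detecting microcalcifications and diagnosing through analysis of their shape and distribution is therefore very important for the early detection of cancer. Attempts to automate this detection process have been a focus of interest in digital image processing. This study proposes a statistical pattern classification method whose performance is improved by using the correlation coefficient as a feature. The resulting detection rate improved from 48% to 83% compared with a binarization method based on statistical thresholding. Performance was evaluated in terms of TP and FP, and the error in class separation is also reported.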
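
To illustrate how a correlation coefficient can serve as a pattern-matching feature, the sketch below computes the Pearson correlation between an image patch and a template, both flattened to lists of pixel intensities. The 3x3 template and patches are hypothetical toy values; the abstract does not describe the templates or window sizes used in the study.

```python
import math

def correlation_coefficient(patch, template):
    """Pearson correlation between two equally sized lists of pixel intensities."""
    n = len(patch)
    if n == 0 or n != len(template):
        raise ValueError("patch and template must be non-empty and equally sized")
    mean_p = sum(patch) / n
    mean_t = sum(template) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, template))
    var_p = sum((p - mean_p) ** 2 for p in patch)
    var_t = sum((t - mean_t) ** 2 for t in template)
    if var_p == 0 or var_t == 0:
        return 0.0  # a constant region carries no pattern to match
    return cov / math.sqrt(var_p * var_t)

# Hypothetical 3x3 template with a bright centre, flattened row by row.
template = [0, 1, 0, 1, 4, 1, 0, 1, 0]
bright_spot = [10, 12, 10, 12, 18, 12, 10, 12, 10]  # same shape, different offset and gain
flat_region = [5] * 9

r_spot = correlation_coefficient(bright_spot, template)
r_flat = correlation_coefficient(flat_region, template)
```

Because the coefficient is unaffected by a constant offset or gain in brightness, a feature of this kind responds to the shape of a spot rather than to its absolute intensity.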

Classification-Based Approach for Hybridizing Statistical and Rule-Based Machine Translation

  • Park, Eun-Jin;Kwon, Oh-Woog;Kim, Kangil;Kim, Young-Kil
    • ETRI Journal
    • /
    • Vol. 37, No. 3
    • /
    • pp. 541-550
    • /
    • 2015
  • In this paper, we propose a classification-based approach for hybridizing statistical machine translation and rule-based machine translation. Both the training dataset used to learn our proposed classifier and our feature extraction method affect the hybridization quality. To create such a training dataset, a previous approach used auto-evaluation metrics to determine, by comparison, which of a set of component machine translation (MT) systems gave the more accurate translation. The most accurate translation was then labelled to indicate the MT system from which it came. In this previous approach, when the metric evaluation scores were low, there was a high level of uncertainty as to which of the component MT systems was actually producing the better translation. To reduce such uncertainty or error in classification, we propose an alternative approach to labeling: a cut-off method. In our experiments, using the cut-off method in our proposed classifier, we achieved a translation accuracy of 81.5% - a 5.0% improvement over existing methods.
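
One plausible reading of the cut-off labelling idea is sketched below: each training sentence is labelled with the MT system that scored best, but only if that best score reaches a threshold. Otherwise the example is left unlabelled. The exact cut-off rule, the threshold value, and the scores shown are assumptions for illustration; the abstract does not specify them.

```python
def label_with_cutoff(scores, cutoff):
    """Label each sentence with its best-scoring MT system, or None below the cut-off.

    scores: list of dicts mapping system name -> auto-evaluation metric score.
    """
    labels = []
    for sentence_scores in scores:
        best = max(sentence_scores, key=sentence_scores.get)
        # Low scores give little evidence about which system is really better.
        labels.append(best if sentence_scores[best] >= cutoff else None)
    return labels

# Hypothetical metric scores for three sentences and two component systems.
scores = [
    {"SMT": 0.62, "RBMT": 0.48},
    {"SMT": 0.12, "RBMT": 0.15},
    {"SMT": 0.30, "RBMT": 0.55},
]
labels = label_with_cutoff(scores, cutoff=0.25)
```

Sentences labelled None would be excluded from classifier training under this reading.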

Study on the Effect of Discrepancy of Training Sample Population in Neural Network Classification

  • Lee, Sang-Hoon;Kim, Kwang-Eun
    • 대한원격탐사학회지 / Vol. 18, No. 3 / pp. 155-162 / 2002
  • Neural networks have attracted attention as robust classifiers for remotely sensed imagery because of their statistical independence and learning ability. Artificial neural networks have also been reported to be more tolerant of noise and missing data. However, unlike conventional statistical classifiers, which use statistical parameters for classification, a neural network classifier uses individual training samples in the learning stage. The training performance of a neural network is known to be very sensitive to discrepancies in the number of training samples per class. In this paper, the effect of the population discrepancy of training samples across classes was analyzed with a three-layer feed-forward network. A method for reducing the effect was proposed and tested on a Landsat TM image. The results showed that the discrepancy in training sample size should be carefully considered for faster and more accurate training of the network. It was also found that the proposed method, which makes the learning rate a function of the number of training samples in each class, resulted in faster and more accurate training of the network.
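
The abstract does not give the functional form that ties the learning rate to class size. As one hedged possibility, the sketch below scales the learning rate inversely with the number of training samples in a class, so that heavily sampled classes do not dominate the weight updates. The inverse-proportional rule, the base rate, and the class counts are all assumptions for illustration.

```python
def class_learning_rates(class_counts, base_rate=0.1):
    """Assign each class a learning rate inversely proportional to its sample count.

    The smallest class keeps base_rate; larger classes get proportionally less.
    """
    smallest = min(class_counts.values())
    return {c: base_rate * smallest / n for c, n in class_counts.items()}

# Hypothetical training sample counts per land-cover class.
counts = {"water": 200, "forest": 1000, "urban": 400}
rates = class_learning_rates(counts)
```

Under this rule the product of rate and sample count is the same for every class, which is one simple way to balance each class's total contribution per epoch.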

Classification of algae in watersheds using elastic shape

  • Tae-Young Heo;Jaehoon Kim;Min Ho Cho
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 31, No. 3
    • /
    • pp. 309-322
    • /
    • 2024
  • Identifying algae in water is important for managing algal blooms, which have a great impact on drinking water supply systems. Various microscopic approaches have been developed for algae classification, many of them based on the morphological features of algae. However, there have seldom been mathematical frameworks for comparing the shape of algae, represented as a planar continuous curve obtained from an image. In this work, we describe a recent framework for computing the shape distance between two different algae based on the elastic metric and a novel functional representation called the square root velocity function (SRVF). We further introduce statistical procedures for multiple shapes of algae, including computing the sample mean and the sample covariance and performing principal component analysis (PCA). Based on the shape distance, we classify six algal species in watersheds experiencing algal blooms, including three cyanobacteria (Microcystis, Oscillatoria, and Anabaena), two diatoms (Fragilaria and Synedra), and one green alga (Pediastrum). We provide and compare the classification performance of various distance-based and model-based methods. We additionally compare elastic shape distance to non-elastic distance using nearest neighbor classifiers.
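
The square root velocity function of a curve f is q(t) = f'(t) / sqrt(|f'(t)|). The sketch below is a minimal discrete version for a planar curve sampled at uniform parameter steps, plus the plain L2 distance between two SRVFs. It deliberately omits the optimization over rotations and reparameterizations that the full elastic shape distance requires, and the function names and sample curve are hypothetical.

```python
import math

def srvf(points):
    """Discrete SRVF of a planar curve sampled at uniform steps of t on [0, 1]."""
    dt = 1.0 / (len(points) - 1)
    q = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        vx, vy = (x1 - x0) / dt, (y1 - y0) / dt  # finite-difference velocity
        speed = math.hypot(vx, vy)
        if speed == 0.0:
            q.append((0.0, 0.0))
        else:
            s = math.sqrt(speed)
            q.append((vx / s, vy / s))
    return q, dt

def l2_distance(q1, q2, dt):
    """L2 distance between two SRVFs on the same grid (no alignment performed)."""
    return math.sqrt(sum(((a - c) ** 2 + (b - d) ** 2) * dt
                         for (a, b), (c, d) in zip(q1, q2)))

# Hypothetical curve: a straight segment from (0, 0) to (3, 4), which has length 5.
segment = [(0.75 * i, 1.0 * i) for i in range(5)]
q, dt = srvf(segment)
squared_norm = sum((a * a + b * b) * dt for a, b in q)
```

A useful sanity check is that the squared L2 norm of the SRVF equals the length of the curve, which holds for the segment above.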

Analysis and Classification of PD Distribution for VPI Stator Coil of Traction Motor

  • 박성희;강성화;임기조;장동욱;박현준
    • 대한전기학회: Conference Proceedings / 대한전기학회 2004 Summer Conference Proceedings C / pp. 1982-1984 / 2004
  • The stator coil of rotating machinery shows different characteristics depending on whether or not the coil is impregnated, and this is a major determinant of the equipment's life. In this paper, PD characteristics are studied as a classification scheme between two specimens. The coil impregnation process is very important because it influences the thermal and electrical characteristics of the coil. PD occurs at the coil and causes insulation degradation. For statistical processing, PD data were acquired from a PD detector using PDASDA (partial discharge acquisition, storage and display system). The resulting statistical distributions and parameters are then applied to classify PD sources with neural networks. As a result, the neural networks show a good discrimination rate for classifying PD sources.
