• Title/Summary/Keyword: statistical classifier

A Neural Net Classifier for Hangeul Recognition (한글 인식을 위한 신경망 분류기의 응용)

  • 최원호;최동혁;이병래;박규태
    • Journal of the Korean Institute of Telematics and Electronics / v.27 no.8 / pp.1239-1249 / 1990
  • In this paper, an adaptive Mahalanobis distance classifier (AMDC) is designed using neural network design techniques. The classifier has three layers: an input layer, an internal layer, and an output layer. The input layer is fully connected to the internal layer, while the internal layer is only partially connected to the output layer, which can be thought of as an ORing of internal nodes. If two or more clusters of patterns of one class lie apart in the feature space, the network adaptively generates internal nodes corresponding to the subclusters of that class. The number of output nodes equals the number of classes to classify, whereas the number of internal nodes is determined by the number of subclusters and is optimized by the network itself. By forming subclasses in this way, differing patterns of the same class can easily be distinguished from other classes. If additional training is needed after training is complete, the AMDC does not have to repeat the training already done. To test the performance of the AMDC, experiments classifying 500 Hangeul characters were carried out. Twenty print font sets of Hangeul characters (10,000 characters) were used for training and 3 sets (1,500 characters) for testing; the AMDC was tested for various values of the initial variance and threshold and compared with other statistical and neural classifiers.

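The subcluster idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' AMDC: it assumes a diagonal covariance that stays fixed at the initial variance, a running-mean update, and hypothetical names (`Subcluster`, `SubclusterClassifier`, `partial_fit`). Each internal node is one subcluster; a new node is created when a training pattern is farther than the threshold from every existing node of its class, and the class output is effectively the OR of its nodes.

```python
from dataclasses import dataclass


@dataclass
class Subcluster:
    """One internal node: a subcluster belonging to a single class."""
    label: str
    mean: list
    var: list          # diagonal variances (assumption: kept fixed)
    count: int = 1


def mahalanobis_sq(x, node):
    """Squared Mahalanobis distance under a diagonal covariance."""
    return sum((xi - m) ** 2 / v for xi, m, v in zip(x, node.mean, node.var))


class SubclusterClassifier:
    def __init__(self, initial_variance=1.0, threshold=9.0):
        self.initial_variance = initial_variance
        self.threshold = threshold      # squared-distance limit for joining a node
        self.nodes = []

    def partial_fit(self, x, label):
        """Add one pattern without revisiting earlier training data."""
        same = [n for n in self.nodes if n.label == label]
        best = min(same, key=lambda n: mahalanobis_sq(x, n), default=None)
        if best is None or mahalanobis_sq(x, best) > self.threshold:
            # Pattern is far from every subcluster of its class: add a node.
            var = [self.initial_variance] * len(x)
            self.nodes.append(Subcluster(label, list(x), var))
            return
        # Otherwise move the nearest subcluster's mean toward the pattern.
        best.count += 1
        rate = 1.0 / best.count
        best.mean = [m + rate * (xi - m) for xi, m in zip(x, best.mean)]

    def predict(self, x):
        """The nearest internal node wins; its class is the output (ORing)."""
        return min(self.nodes, key=lambda n: mahalanobis_sq(x, n)).label
```

Because `partial_fit` only touches the nearest node or appends a new one, later patterns can be added without repeating earlier training, which mirrors the incremental property the abstract describes.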

Development of Fuzzy Support Vector Machine and Evaluation of Performance Using Ionosphere Radar Data (Fuzzy Twin Support Vector Machine 개발 및 전리층 레이더 데이터를 통한 성능 평가)

  • Cheon, Min-Kyu;Yoon, Chang-Yong;Kim, Eun-Tai;Park, Mig-Non
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.4 / pp.549-554 / 2008
  • The support vector machine (SVM) is a classifier based on statistical learning theory. The twin support vector machine (TWSVM) is a binary classifier that determines two nonparallel planes by solving two related SVM-type problems. The training time of TWSVM is shorter than that of SVM, and its performance is no worse. This paper proposes a TWSVM that incorporates fuzzy membership and compares its performance with that of other classifiers on the Ionosphere radar data set.

Accelerating the EM Algorithm through Selective Sampling for Naive Bayes Text Classifier (나이브베이즈 문서분류시스템을 위한 선택적샘플링 기반 EM 가속 알고리즘)

  • Chang Jae-Young;Kim Han-Joon
    • The KIPS Transactions:PartD / v.13D no.3 s.106 / pp.369-376 / 2006
  • This paper presents a new method that significantly improves a conventional Bayesian statistical text classifier by incorporating an accelerated EM (Expectation Maximization) algorithm. The EM algorithm suffers from slow convergence and performance degradation during its iterative process, especially when real online textual documents do not follow EM's assumptions. In this study, we propose a new accelerated EM algorithm with uncertainty-based selective sampling, which is simple yet converges quickly and allows a more accurate classification model to be estimated for the Naive Bayes text classifier. Experiments on the popular Reuters-21578 document collection showed that the proposed algorithm effectively improves classification accuracy.

Classifications of Hadiths based on Supervised Learning Techniques

  • AbdElaal, Hammam M.;Bouallegue, Belgacem;Elshourbagy, Motasem;Matter, Safaa S.;AbdElghfar, Hany A.;Khattab, Mahmoud M.;Ahmed, Abdelmoty M.
    • International Journal of Computer Science & Network Security / v.22 no.11 / pp.1-10 / 2022
  • This study aims to build a model capable of classifying hadiths both according to the reliability of their narrators (sahih, hassan, da'if, maudu) and according to what was attributed to the Prophet Muhammad (saying, doing, describing, reporting), using supervised learning algorithms. The goal is to discover a relationship between these two classifications from the outputs of the model, using statistical methods such as chi-square, information gain, and association rules, which might help avoid controversy and useless debate over automatic classification of hadith. The experimental results showed that the classifications are related: most sahih hadiths belong to the saying class, and most maudu hadiths belong to the reporting class. The most accurate classifier was MultinomialNB, which reached an accuracy of up to 0.9708, owing to its ability to handle high-dimensional problems and to identify, in the training stage, the most important features relevant to the target data. It was followed by LinearSVC, which reached up to 0.9655, and finally KNeighborsClassifier, which reached up to 0.9644.

Prediction of extreme PM2.5 concentrations via extreme quantile regression

  • Lee, SangHyuk;Park, Seoncheol;Lim, Yaeji
    • Communications for Statistical Applications and Methods / v.29 no.3 / pp.319-331 / 2022
  • In this paper, we develop a new statistical model to forecast the PM2.5 level in Seoul, South Korea. The proposed model is based on the extreme quantile regression model with lasso penalty. Various meteorological variables and air pollution variables are considered as predictors in the regression model, and the lasso quantile regression performs variable selection and solves the multicollinearity problem. The final prediction model is obtained by combining various extreme lasso quantile regression estimators and we construct a binary classifier based on the model. Prediction performance is evaluated through the statistical measures of the performance of a binary classification test. We observe that the proposed method works better compared to the other classification methods, and predicts 'very bad' cases of the PM2.5 level well.
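As a rough illustration of the lasso-penalized quantile regression mentioned in the abstract above, the sketch below evaluates the standard objective: the pinball (check) loss at quantile level `tau` plus an L1 penalty. The function names are hypothetical, and penalizing every coefficient (including any intercept column) is an assumption; the abstract does not give the authors' exact formulation or how they combine estimators.

```python
def pinball_loss(residual, tau):
    """Check loss: weight tau on positive residuals, (1 - tau) on negative ones."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def lasso_quantile_objective(X, y, beta, tau, lam):
    """Sum of pinball losses over all rows plus lam times the L1 norm of beta."""
    fit = 0.0
    for row, target in zip(X, y):
        prediction = sum(b * x for b, x in zip(beta, row))
        fit += pinball_loss(target - prediction, tau)
    penalty = lam * sum(abs(b) for b in beta)
    return fit + penalty
```

For an extreme quantile such as `tau = 0.95`, under-prediction costs 19 times as much as over-prediction of the same size, which pushes the fitted line toward the upper tail of the response.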

R-to-R Extraction and Preprocessing Procedure for an Automated Diagnosis of Various Diseases from ECG Data

  • Timothy, Vincentius;Prihatmanto, Ary Setijadi;Rhee, Kyung-Hyune
    • Journal of Multimedia Information System / v.3 no.2 / pp.1-8 / 2016
  • In this paper, we propose a method to automatically diagnose various diseases. The input data consists of electrocardiograph (ECG) recordings. We extract R-to-R interval (RRI) signals from ECG recordings, which are preprocessed to remove trends and ectopic beats, and to keep the signal stationary. After that, we perform some prospective analysis to extract time-domain parameters, frequency-domain parameters, and nonlinear parameters of the signal. Those parameters are unique for each disease and can be used as the statistical symptoms for each disease. Then, we perform feature selection to improve the performance of the diagnosis classifier. We utilize the selected features to diagnose various diseases using machine learning. We subsequently measure the performance of the machine learning classifier to make sure that it will not misdiagnose the diseases. The first two steps, which are R-to-R extraction and preprocessing, have been successfully implemented with satisfactory results.
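The R-to-R extraction and time-domain steps in the abstract above can be sketched as follows. The abstract does not name the parameters it extracts; SDNN and RMSSD are used here only as common examples of time-domain heart-rate-variability measures, and the function names are hypothetical. R-peak detection, detrending, and ectopic-beat removal are assumed to have been done already.

```python
import math


def rr_intervals(r_peak_times_ms):
    """Differences between consecutive R-peak times, in milliseconds."""
    return [b - a for a, b in zip(r_peak_times_ms, r_peak_times_ms[1:])]


def time_domain_features(rri):
    """Mean RR interval, SDNN (sample standard deviation), and RMSSD."""
    mean_rr = sum(rri) / len(rri)
    sdnn = math.sqrt(sum((r - mean_rr) ** 2 for r in rri) / (len(rri) - 1))
    successive = [b - a for a, b in zip(rri, rri[1:])]
    rmssd = math.sqrt(sum(d * d for d in successive) / len(successive))
    return {"mean_rr": mean_rr, "sdnn": sdnn, "rmssd": rmssd}
```

A call such as `time_domain_features(rr_intervals([0, 800, 1600, 2600, 3400]))` returns one feature dictionary per recording, which could then feed feature selection and a classifier.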

A Fast Method for Face Detection based on PCA and SVM

  • Xia, Chun-Lei;Shin, Hyeon-Gab;Ha, Seok-Wun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2007.06a / pp.153-156 / 2007
  • In this paper, we propose a fast face detection approach using PCA and SVM. Our detection system first filters potential face areas using a statistical feature generated by analyzing the local histogram distribution. We then use an SVM classifier to detect whether faces are present in the test image. The support vector machine (SVM) performs well in classification tasks, and PCA is used for dimension reduction of the sample data. The feature vectors used for training the SVM classifier are generated after the PCA transform. The tests in this paper are based on the CMU face database.


Design and Implementation of Electrocardiogram Data Interpretation system using AdaBoost Algorithm (AdaBoost 알고리즘을 이용한 심전도 정보 판독 시스템의 설계 및 구현)

  • Lim, Myung-Jae;Hong, Jin-Kyoung;Kim, Kyu-Ho;Choi, Mi-Lim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.2 / pp.129-134 / 2010
  • According to data released by the National Statistical Office, 600-800 people died of cardiovascular illnesses. Blood pressure disorders, arteriosclerosis, heart disease, stroke, and similar conditions are cardiovascular illnesses caused by impaired blood flow, and today they show a high mortality rate. About 40% of the patients who die of cardiovascular disease could survive with correct first aid, so a rapid emergency response is required. This paper therefore proposes a system that analyzes measured ECG data with the AdaBoost algorithm, which combines weak classifiers into a strong classifier, and reports to an emergency management desk as soon as cardiovascular disease occurs. ECG data are measured by ZigBee-based sensors and sent through communication devices; the management desk monitors them and raises emergency alarms so that emergency transport and health services can be delivered quickly.
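The way AdaBoost combines weak classifiers into a strong one can be sketched on a toy problem. This is a generic textbook version, not the paper's ECG system: the weak learners are threshold stumps on a single hypothetical numeric feature, labels are +1 or -1, and all function names are illustrative.

```python
import math


def stump(x, threshold, polarity):
    """Weak classifier: predicts polarity above the threshold, else -polarity."""
    return polarity if x > threshold else -polarity


def train_adaboost(xs, ys, rounds=3):
    """Return a list of (alpha, threshold, polarity) weighted stumps."""
    n = len(xs)
    weights = [1.0 / n] * n
    values = sorted(set(xs))
    thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    ensemble = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted training error.
        best = None
        for t in thresholds:
            for polarity in (1, -1):
                err = sum(w for x, y, w in zip(xs, ys, weights)
                          if stump(x, t, polarity) != y)
                if best is None or err < best[0]:
                    best = (err, t, polarity)
        err, t, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, polarity))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [w * math.exp(-alpha * y * stump(x, t, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1
```

On the labels `[1, 1, -1, -1, 1, 1]` for inputs `1..6`, no single stump is correct everywhere, but three boosted stumps classify every training point correctly.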

PCA vs. ICA for Face Recognition

  • Lee, Oyoung;Park, Hyeyoung;Park, Seung-Jin
    • Proceedings of the IEEK Conference / 2000.07b / pp.873-876 / 2000
  • The information-theoretic approach to face recognition is based on compact coding, in which face images are decomposed into a small set of basis images. The most popular method for compact coding is probably principal component analysis (PCA), on which eigenface methods are based. PCA-based methods exploit only the second-order statistical structure of the data, so higher-order statistical dependencies among pixels are not considered. Independent component analysis (ICA) is a signal processing technique whose goal is to express a set of random variables as linear combinations of statistically independent component variables. ICA exploits the high-order statistical structure of the data, which contains important information. In this paper we employ ICA for efficient feature extraction from face images and show that ICA outperforms PCA in the task of face recognition. Experimental results using a simple nearest classifier and a multilayer perceptron (MLP) are presented to illustrate the performance of the proposed method.


Hyperparameter Selection for APC-ECOC

  • Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1219-1231 / 2008
  • The main objective of this paper is to develop a leave-one-out (LOO) bound for all pairwise comparison error correcting output codes (APC-ECOC). To avoid using classifiers whose corresponding target values are 0 in APC-ECOC, and to avoid requiring pilot estimates, we developed a bound based on the mean misclassification probability (MMP). The bound can be used to tune kernel hyperparameters. Our empirical experiment, using the kernel mean squared estimate (KMSE) as the binary classifier, indicates that the bound leads to good estimates of kernel hyperparameters.

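For readers unfamiliar with all-pairs ECOC, the sketch below builds the code matrix the abstract above refers to: one binary classifier per pair of classes, with target +1 for the first class of the pair, -1 for the second, and 0 for every class the classifier was not trained on. The decoding rule (pick the class whose code row correlates best with the classifier outputs) is a common choice assumed here for illustration; the paper's LOO bound and MMP are not reproduced, and the function names are hypothetical.

```python
from itertools import combinations


def all_pairs_code_matrix(n_classes):
    """Rows are classes, columns are pairwise classifiers; entries are +1, -1, 0."""
    pairs = list(combinations(range(n_classes), 2))
    matrix = [[0] * len(pairs) for _ in range(n_classes)]
    for col, (first, second) in enumerate(pairs):
        matrix[first][col] = 1
        matrix[second][col] = -1
    return matrix, pairs


def decode(matrix, outputs):
    """Return the class whose code row best matches the classifier outputs.

    Columns with target 0 contribute nothing to that class's score.
    """
    scores = [sum(c * o for c, o in zip(row, outputs)) for row in matrix]
    return max(range(len(matrix)), key=scores.__getitem__)
```

With three classes the pairs are (0, 1), (0, 2), and (1, 2), so each class has exactly one zero entry, which is the situation the abstract's bound is designed to handle.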