• Title/Summary/Keyword: Classification Performance

Statistical Information-Based Hierarchical Fuzzy-Rough Classification Approach (통계적 정보기반 계층적 퍼지-러프 분류기법)

  • Son, Chang-S.;Seo, Suk-T.;Chung, Hwan-M.;Kwon, Soon-H.
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.6 / pp.792-798 / 2007
  • In this paper, we propose a hierarchical fuzzy-rough classification method based on statistical information, aimed at maximizing pattern classification performance and reducing the number of rules without learning approaches such as neural networks or genetic algorithms. In the proposed method, statistical information is used to extract the partition intervals of the antecedent fuzzy sets at each layer of the hierarchical fuzzy-rough classification system, and rough sets are used to minimize the number of fuzzy if-then rules associated with those partition intervals. To show the effectiveness of the proposed method, we compared its classification results (e.g., classification accuracy and number of rules) with those of conventional methods on Fisher's Iris data. The experimental results confirm that the proposed method, which considers only the statistical information of the given data, achieves classification performance similar to that of the conventional methods.

Combining Multiple Classifiers for Automatic Classification of Email Documents (전자우편 문서의 자동분류를 위한 다중 분류기 결합)

  • Lee, Jae-Haeng;Cho, Sung-Bae
    • Journal of KIISE:Software and Applications / v.29 no.3 / pp.192-201 / 2002
  • Automated text classification is considered an important method for managing and processing the huge and continuously growing volume of documents in digital form. Recently, text classification has been addressed with machine learning techniques such as k-nearest neighbor, decision trees, support vector machines, and neural networks. However, few studies in text classification address real problems; most are evaluated on well-organized text corpora and so do not demonstrate practical usefulness. This paper proposes and analyzes text classification methods for a real application, the email document classification task. First, we propose a method of combining multiple neural networks that improves performance through combination by maximum and by neural networks. Second, we present another strategy that combines multiple machine learning classifiers: voting, Borda count, and neural networks improve the overall classification performance. Experimental results show the usefulness of the proposed methods for a real application domain, yielding precision rates of more than 90%.
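The voting and Borda count combination rules named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label names, the example rankings, and the alphabetical tie-breaking are assumptions made for illustration only.

```python
from collections import defaultdict


def majority_vote(predictions):
    """Return the label predicted by the most classifiers."""
    counts = defaultdict(int)
    for label in predictions:
        counts[label] += 1
    # Ties are broken alphabetically (an assumption, for determinism).
    return min(counts, key=lambda label: (-counts[label], label))


def borda_count(rankings):
    """Combine per-classifier rankings (best label first) into one ranking."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            # The top-ranked label earns n - 1 points, the last earns 0.
            scores[label] += n - 1 - position
    return sorted(scores, key=lambda label: (-scores[label], label))


# Hypothetical rankings from three classifiers over three email folders.
rankings = [
    ["work", "personal", "spam"],
    ["personal", "work", "spam"],
    ["work", "spam", "personal"],
]
combined = borda_count(rankings)
top_choice = majority_vote([r[0] for r in rankings])
```

Voting uses only each classifier's top choice, whereas the Borda count also rewards labels that are ranked second or third, which can matter when the classifiers disagree.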

Comparative Study of Various Machine-learning Features for Tweets Sentiment Classification (트윗 감정 분류를 위한 다양한 기계학습 자질에 대한 비교 연구)

  • Hong, Cho-Hee;Kim, Hark-Soo
    • The Journal of the Korea Contents Association / v.12 no.12 / pp.471-478 / 2012
  • Various studies on sentiment classification of documents have been performed, and they have recently been applied to Twitter sentiment classification. However, they did not perform well because they did not consider the characteristics of tweets, such as tweet structure, emoticons, spelling errors, and newly coined words. In this paper, we perform experiments on various input features (emoticon polarity, retweet polarity, author polarity, and replacement words) that may affect a Twitter sentiment classification model based on machine learning techniques. In experiments with a sentiment classification model based on a support vector machine, we found that the emoticon polarity features and the author polarity features contribute to improving the performance of the model. We also found that, contrary to our expectations, the retweet polarity features and the replacement word features do not affect its performance.

Rock Mass Classification and Its Use in Blast Design for Tunneling (암분류기법과 터널굴착을 위한 발파설계에의 활용)

  • Ryu Chang-Ha;SunWoo Choon;Choi Byung-Hee
    • Explosives and Blasting / v.24 no.1 / pp.63-69 / 2006
  • Building tunnels means dealing with whatever rock is encountered. Relocating the site of an underground structure is rarely possible, so tunneling engineers and miners have to cope with the quality of the rock mass as it is. Different tunneling philosophies and different rock classification methods have been developed in various countries. Most rock classification methods are based on the response of the rock mass to excavation. Tunnel support requirements can be assessed analytically, supplemented by rock mass classification predictions, and verified by measurements during construction. Rock mass classifications on their own should be used only for preliminary planning purposes and not for final tunnel support. The design of blast patterns in tunneling projects in Korea is also mostly prepared according to general rock classification methods such as RMR or Q. These methods, however, do not take blast performance into account and, as a consequence, produce poor blasting results. In this paper, the methods of general rock classification and blast design for tunnel excavation in Korea are reviewed, and efforts to develop a new classification method that reflects blasting performance are presented.

Movie Popularity Classification Based on Support Vector Machine Combined with Social Network Analysis

  • Dorjmaa, Tserendulam;Shin, Taeksoo
    • Journal of Information Technology Services / v.16 no.3 / pp.167-183 / 2017
  • The rapid growth of information technology and mobile service platforms, e.g., the Internet, Google, and Facebook, has led to an abundance of data. As a result, the world is now facing a revolution in the way data is searched, collected, stored, and shared. This abundance of data creates opportunities for knowledge discovery and data mining techniques. In recent years, data mining methods for discovering and extracting knowledge from databases have become more popular in e-commerce service fields, in particular movie recommendation. However, most classification approaches for predicting movie popularity have used only a few types of movie information, such as actor, director, rating score, language, and country. In this study, we propose a classification model based on a support vector machine (SVM) for predicting movie popularity from movie genre data and social network data. Social network analysis (SNA) is used to improve the classification accuracy. This study builds the movie network (a one-mode network) from the initial data, which form a two-mode user-to-movie network. For the proposed method, we computed degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality as centrality measures in the movie network. These four centrality values and the movie genre data were used to classify movie popularity. Logistic regression, a neural network, a naïve Bayes classifier, and a decision tree were also used as benchmark models for comparison with the proposed model. To assess classification accuracy, this study used the open MovieLens data. Our empirical results indicate that the proposed model with movie genre and centrality data has approximately 0% higher accuracy than other classification models with only movie genre data. The results imply that the proposed model can be used to improve the accuracy of movie popularity classification.
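The projection from a two-mode user-to-movie network to a one-mode movie network, followed by degree centrality, can be sketched as below. This is an illustrative sketch only: the rule that two movies are linked when at least one user is connected to both, the function names, and the toy data are assumptions, not details taken from the paper. The other three centrality measures are omitted.

```python
from itertools import combinations


def project_to_movie_network(user_movies):
    """Build a one-mode movie network from a user-to-movie mapping.

    Assumption: two movies are linked when at least one user is
    connected to both of them.
    """
    neighbors = {}
    for movies in user_movies.values():
        unique = sorted(set(movies))
        for movie in unique:
            # Register every movie, including ones that end up isolated.
            neighbors.setdefault(movie, set())
        for a, b in combinations(unique, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def degree_centrality(neighbors):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(neighbors)
    if n < 2:
        return {movie: 0.0 for movie in neighbors}
    return {movie: len(adj) / (n - 1) for movie, adj in neighbors.items()}


# Hypothetical toy data: three users and four movies.
user_movies = {"u1": ["A", "B"], "u2": ["B", "C"], "u3": ["D"]}
network = project_to_movie_network(user_movies)
centrality = degree_centrality(network)
```

In this toy network, movie B is linked to both A and C, so it receives the highest degree centrality, while D has no links at all. Values like these could then be appended to the genre features before training a classifier.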

Performance Improvement of Web Document Classification through Incorporation of Feature Selection and Weighting (특징선택과 특징가중의 융합을 통한 웹문서분류 성능의 개선)

  • Lee, Ah-Ram;Kim, Han-Joon;Man, Xuan
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.4 / pp.141-148 / 2013
  • Automated classification systems that use machine learning develop classification models through a learning process and then classify unknown data into a predefined set of categories according to the model. The performance of machine learning-based classification systems relies greatly on the quality of the features that compose the classification models. For textual data, word terms and structural information can be used to generate the feature set. In particular, extracting features from Web documents requires analyzing tag and hyperlink information. Recent studies on Web document classification focus on feature engineering rather than on the machine learning algorithms themselves. This paper therefore proposes a novel method that incorporates feature selection and feature weighting to improve classification models effectively. Extensive experiments using the Web-KB document collection show that the proposed method outperforms conventional ones.

Gait-Based Gender Classification Using a Correlation-Based Feature Selection Technique

  • Beom Kwon
    • Journal of the Korea Society of Computer and Information / v.29 no.3 / pp.55-66 / 2024
  • Gender classification techniques have received a lot of attention from researchers because they can be used in various fields such as forensics, surveillance systems, and demographic studies. Because previous studies have shown that there are distinctive differences between male and female gait, various techniques have been proposed to classify gender from three-dimensional (3-D) gait data. However, some of the gait features extracted from 3-D gait data using existing techniques are similar to or redundant with each other, or do not help in gender classification. In this study, we propose a method for selecting features that are useful for gender classification using a correlation-based feature selection technique. To demonstrate the effectiveness of the proposed technique, we compare the performance of gender classification models before and after applying it, using a 3-D gait dataset available on the Internet. Eight machine learning algorithms applicable to binary classification problems were used in the experiments. The experimental results show that the proposed feature selection technique can reduce the number of features by 22, from 82 to 60, while maintaining gender classification performance.
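The general idea of using correlations to discard irrelevant and redundant features can be sketched as follows. This is a simplified greedy filter written for illustration, not the selection technique used in the paper: the thresholds `min_relevance` and `max_redundancy`, the feature names, and the toy values are all assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / (norm_x * norm_y)


def select_features(features, labels, min_relevance=0.3, max_redundancy=0.9):
    """Keep features correlated with the label but not with each other."""
    relevance = {name: abs(pearson(col, labels)) for name, col in features.items()}
    selected = []
    # Visit features from most to least relevant.
    for name in sorted(features, key=lambda f: (-relevance[f], f)):
        if relevance[name] < min_relevance:
            continue  # does not help classification
        if all(abs(pearson(features[name], features[kept])) <= max_redundancy
               for kept in selected):
            selected.append(name)  # not redundant with anything kept so far
    return selected


# Hypothetical gait features for six subjects (labels: 0 and 1).
labels = [0, 0, 0, 1, 1, 1]
features = {
    "stride_length": [1, 2, 3, 7, 8, 9],
    "step_length": [1, 3, 2, 6, 8, 10],  # nearly duplicates stride_length
    "hip_sway": [5, 1, 3, 4, 6, 8],
    "noise": [1, 2, 3, 1, 2, 3],         # uncorrelated with the label
}
selected = select_features(features, labels)
```

Here `step_length` is dropped as redundant with `stride_length`, and `noise` is dropped as irrelevant, which mirrors the two kinds of unhelpful features the abstract describes.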

Improving the Performance of SVM Text Categorization with Inter-document Similarities (문헌간 유사도를 이용한 SVM 분류기의 문헌분류성능 향상에 관한 연구)

  • Lee, Jae-Yun
    • Journal of the Korean Society for Information Management / v.22 no.3 s.57 / pp.261-287 / 2005
  • The purpose of this paper is to explore ways to improve the performance of SVM (Support Vector Machine) text classifiers using inter-document similarities. SVMs are powerful machine learning systems that are considered the state-of-the-art technique for automatic document classification. This paper suggests an SVM-based text categorization approach in which features are represented with document vectors. In this approach, document vectors are used as features instead of index terms, and vector similarities are used as feature values instead of term weights. Experiments show that an SVM classifier with document vector features can improve document classification performance. For the sake of run-time efficiency, two methods are developed: one selects a subset of document vector features, and the other uses category centroid vector features instead. Experiments on these two methods show that a small vector feature set can yield better performance than conventional methods with index term features.
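The feature representation described here, where a document is described by its similarities to other document vectors rather than by term weights, can be sketched as follows. This is a minimal illustration under stated assumptions: raw term counts, whitespace tokenization, and cosine similarity are choices made for the example, and the abstract does not specify them. The SVM itself is not shown.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts (whitespace tokenization is an assumption)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Average term vector of a category, usable as one reference vector."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {term: count / len(vectors) for term, count in total.items()}


def similarity_features(text, reference_vectors):
    """One feature per reference vector: the similarity to that vector."""
    vec = term_vector(text)
    return [cosine(vec, ref) for ref in reference_vectors]


# Hypothetical reference documents; their vectors define the feature space.
references = [
    term_vector("svm text classifier"),
    term_vector("tunnel blast design"),
]
features = similarity_features("svm classifier", references)
```

With this representation, the feature vector has one dimension per reference vector, so replacing all training documents with a few category centroids (via `centroid`) shrinks the feature set, which corresponds to the run-time motivation given in the abstract.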

Performance Evaluation of SG Tube Defect Size Estimation System in the Absence of Defect Type Classification (결함 형태 분류 과정이 필요없는 SG 세관 결함 크기 추정 시스템의 성능 평가)

  • Jo, Nam-Hoon
    • Journal of the Korean Society for Nondestructive Testing / v.30 no.1 / pp.13-19 / 2010
  • In this paper, we study a new estimation system for predicting the size of steam generator (SG) tube defects. In previous research, defect size estimators were designed independently for each defect type. As a result, the structure of the estimation system is rather complex, and estimation performance deteriorates if classification performance is degraded for some reason. This paper studies a new estimation system that does not require the classification of defect types. Although the previous approaches are expected to achieve much better estimation performance than the proposed system, since they use an estimator specialized for each defect type, the performance difference is not large. Therefore, the proposed estimator is expected to be effective in cases where defect type classification is imperfect.

Performance Comparison for Radar Target Classification of Monostatic RCS and Bistatic RCS (모노스태틱 RCS와 바이스태틱 RCS의 표적 구분 성능 분석)

  • Lee, Sung-Jun;Choi, In-Sik
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.21 no.12 / pp.1460-1466 / 2010
  • In this paper, we analyzed the performance of radar target classification using the monostatic and bistatic radar cross section (RCS) for four different wire targets. The short-time Fourier transform (STFT) and the continuous wavelet transform (CWT) were used for feature extraction from the monostatic RCS and the bistatic RCS of each target, and a multi-layer perceptron (MLP) neural network was used as the classifier. The results show that CWT yields better performance than STFT for both the monostatic RCS and the bistatic RCS. When STFT was used, the performance of the bistatic RCS was slightly better than that of the monostatic RCS. However, when CWT was used, the performance of the monostatic RCS was slightly better than that of the bistatic RCS. Consequently, the results show that the bistatic RCS is a good candidate for radar target classification in combination with the monostatic RCS.