• Title/Summary/Keyword: optimal classification method


Extraction of Optimal Interest Points for Shape-based Image Classification (모양 기반 이미지 분류를 위한 최적의 우세점 추출)

  • 조성택;엄기현
    • Journal of KIISE:Databases
    • /
    • v.30 no.4
    • /
    • pp.362-371
    • /
    • 2003
  • In this paper, we propose an optimal interest point extraction method to support shape-based image classification and indexing in image databases, applying a dynamic threshold that reflects the characteristics of the shape contour. The threshold is determined dynamically while the algorithm runs, by comparing the contour-length ratio of the original shape and the approximated polygon. Because the algorithm considers the characteristics of the shape contour, it can minimize the number of interest points. For a contour of n points, the proposed algorithm extracts m optimal interest points at an average computational cost of O(n log n). Experiments were performed on 70 synthetic shapes of 7 different contour types and 1,100 fish shapes. The method achieves an average optimization ratio of up to 0.92, a 14% improvement over the fixed-threshold method. The shape features extracted by the proposed method can be used for shape-based image classification, indexing, and similarity search via normalization.
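A dynamic-threshold polygon approximation of the kind this abstract describes can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: it halves a Ramer-Douglas-Peucker tolerance until the approximated polygon retains a target fraction of the original contour length; the starting tolerance and the 0.92 target are illustrative assumptions.

```python
import math

def perimeter(points):
    """Total length of the polyline through points."""
    return sum(math.dist(points[i], points[i + 1]) for i in range(len(points) - 1))

def rdp(points, eps):
    """Ramer-Douglas-Peucker polyline simplification with tolerance eps."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    seg = math.dist(points[0], points[-1]) or 1e-12
    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        x0, y0 = points[i]
        # perpendicular distance of the interior point to the end-to-end chord
        d = abs((x2 - x1) * (y1 - y0) - (x1 - x0) * (y2 - y1)) / seg
        if d > dmax:
            dmax, idx = d, i
    if dmax <= eps:
        return [points[0], points[-1]]
    left = rdp(points[:idx + 1], eps)
    right = rdp(points[idx:], eps)
    return left[:-1] + right

def dynamic_interest_points(contour, ratio_target=0.92):
    """Tighten the tolerance until the approximated polygon keeps at
    least ratio_target of the original contour length."""
    eps = perimeter(contour) / len(contour)   # coarse starting tolerance
    approx = rdp(contour, eps)
    while perimeter(approx) / perimeter(contour) < ratio_target and eps > 1e-6:
        eps *= 0.5                            # threshold adapts to this contour
        approx = rdp(contour, eps)
    return approx
```

On an L-shaped contour of seven points, for example, the sketch keeps only the two endpoints and the corner.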

Semantic-based Genetic Algorithm for Feature Selection (의미 기반 유전 알고리즘을 사용한 특징 선택)

  • Kim, Jung-Ho;In, Joo-Ho;Chae, Soo-Hoan
    • Journal of Internet Computing and Services
    • /
    • v.13 no.4
    • /
    • pp.1-10
    • /
    • 2012
  • In this paper, an optimal feature selection method that considers the semantics of features, as a preprocessing step for document classification, is proposed. Feature selection is a crucial part of classification, consisting of removing redundant features and selecting essential ones. LSA (Latent Semantic Analysis) is adopted to account for the meaning of features. However, because basic LSA is not specialized for feature selection, a supervised LSA, which is better suited to classification problems, is used instead. We also apply a GA (Genetic Algorithm) to the features obtained from supervised LSA to select a better feature subset. Finally, we project documents onto the selected feature subset and classify them with a specific classifier, an SVM (Support Vector Machine). By selecting an optimal feature subset with the proposed hybrid of supervised LSA and GA, high classification performance and efficiency are expected. Its efficiency is demonstrated through experiments on Internet news classification with a small number of features.

A Study on the Optimal Discriminant Model Predicting the likelihood of Insolvency for Technology Financing (기술금융을 위한 부실 가능성 예측 최적 판별모형에 대한 연구)

  • Sung, Oong-Hyun
    • Journal of Korea Technology Innovation Society
    • /
    • v.10 no.2
    • /
    • pp.183-205
    • /
    • 2007
  • An investigation was undertaken of the optimal discriminant model for predicting in advance the likelihood of insolvency of medium-sized firms, based on technology evaluation. The explanatory variables in the discriminant model were selected by both factor analysis and stepwise discriminant analysis. Five explanatory variables were selected by factor analysis in terms of explanatory ratio and communality; six were selected by stepwise discriminant analysis. The effectiveness of the linear and logistic discriminant models was assessed by the criteria of the critical probability and the correct classification rate. Results showed that both models had similar correct classification rates, and the linear discriminant model was preferred to the logistic one in terms of the critical probability criterion. For the linear discriminant model with a critical probability of 0.5, the total-group correct classification rate was 70.4%, and the rates for the insolvent and solvent groups were 73.4% and 69.5%, respectively. The correct classification rate estimates the probability that the estimated discriminant function will correctly classify the present sample, whereas the actual correct classification rate estimates the probability that it will correctly classify a future observation. Unfortunately, the former overstates the latter, because the data set used to estimate the discriminant function is also used to evaluate it. The cross-validation method was used to estimate this bias: the estimated bias was 2.9%, and the predicted actual correct classification rate was 67.5%. A threshold value was also set to establish an in-doubt category. The results of the linear discriminant model can be applied by technology financing banks to evaluate the possibility of insolvency and to rank applicant firms.
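The decision rule and bias correction described above can be sketched as follows. The 0.1-wide in-doubt band is a hypothetical illustration (the abstract does not report the threshold value used); the 70.4% apparent rate and 2.9% cross-validation bias are the figures from the abstract.

```python
def classify_firm(p_insolvent, critical=0.5, band=0.1):
    """Assign a firm from its estimated probability of insolvency.
    Probabilities within `band` of the critical probability fall into
    an in-doubt category rather than a hard decision."""
    if abs(p_insolvent - critical) < band:
        return "in-doubt"
    return "insolvent" if p_insolvent > critical else "solvent"

# cross-validation bias correction reported in the study:
apparent_rate, bias = 0.704, 0.029
predicted_actual_rate = apparent_rate - bias   # apparent rate overstates future performance
```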


On Practical Choice of Smoothing Parameter in Nonparametric Classification (베이즈 리스크를 이용한 커널형 분류에서 평활모수의 선택)

  • Kim, Rae-Sang;Kang, Kee-Hoon
    • Communications for Statistical Applications and Methods
    • /
    • v.15 no.2
    • /
    • pp.283-292
    • /
    • 2008
  • The smoothing parameter, or bandwidth, plays a key role in nonparametric classification based on kernel density estimation. We consider choosing the smoothing parameter in nonparametric classification so as to optimize the Bayes risk. Hall and Kang (2005) clarified the theoretical properties of the smoothing parameter in terms of minimizing Bayes risk and derived its optimal order, using a bootstrap method to explore its numerical properties. We compare cross-validation and the bootstrap method numerically in terms of the optimal order of the bandwidth, and also examine the effects on the misclassification rate. We confirm that the bootstrap method is superior to cross-validation in both cases.
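The cross-validation side of this comparison can be sketched in miniature: a kernel density classifier whose bandwidth is chosen by leave-one-out misclassification over a grid. This is a generic illustration of bandwidth selection for classification, not the paper's Bayes-risk procedure; the Gaussian kernel and the grid are assumptions.

```python
import math

def kde(x, sample, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    return sum(math.exp(-((x - s) / h) ** 2 / 2) for s in sample) \
        / (len(sample) * h * math.sqrt(2 * math.pi))

def loo_error(class0, class1, h):
    """Leave-one-out misclassification rate of the kernel classifier."""
    errors, n = 0, len(class0) + len(class1)
    for i, x in enumerate(class0):
        rest = class0[:i] + class0[i + 1:]    # hold out the point itself
        if kde(x, rest, h) < kde(x, class1, h):
            errors += 1
    for i, x in enumerate(class1):
        rest = class1[:i] + class1[i + 1:]
        if kde(x, rest, h) < kde(x, class0, h):
            errors += 1
    return errors / n

def best_bandwidth(class0, class1, grid):
    """Pick the grid bandwidth minimizing the leave-one-out error."""
    return min(grid, key=lambda h: loo_error(class0, class1, h))
```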

Support Vector Machine Model to Select Exterior Materials

  • Kim, Sang-Yong
    • Journal of the Korea Institute of Building Construction
    • /
    • v.11 no.3
    • /
    • pp.238-246
    • /
    • 2011
  • Choosing the best-performing materials is a crucial task for the successful completion of a project in the construction field. In general, material selection is performed using the knowledge of a highly experienced expert and the purchasing agent, without the assistance of logical decision-making techniques. For this reason, the construction field has considered various artificial intelligence (AI) techniques to support decision systems for material selection. This study proposes a systematic and efficient support vector machine (SVM) model for selecting optimal exterior materials. The dataset consists of 120 completed construction projects in South Korea. A total of 8 input determinants were identified and verified through a literature review and interviews with experts. After data classification and normalization, the 120 cases were divided into 3 groups, and 5 binary classification models were constructed in a one-against-all (OAA) multi-class classification scheme. The SVM model, based on the radial basis function (RBF) kernel, yielded a prediction accuracy of 87.5%. This study indicates that the SVM model is feasible as a decision support system for selecting optimal exterior materials.
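The one-against-all scheme mentioned above decomposes a multi-class problem into one binary model per class. The sketch below shows only that wrapper structure; the `centroid_scorer` is a hypothetical stand-in for a trained binary SVM, used to keep the example self-contained.

```python
def train_oaa(X, y, labels, train_binary):
    """One-against-all: fit one binary scorer per label (label vs. rest)."""
    models = {}
    for label in labels:
        targets = [1 if yi == label else -1 for yi in y]
        models[label] = train_binary(X, targets)
    return models

def predict_oaa(models, x):
    """Assign x to the label whose binary model scores it highest."""
    return max(models, key=lambda label: models[label](x))

def centroid_scorer(X, targets):
    """Stand-in for a binary SVM: score = negative squared distance
    to the centroid of the positive class."""
    pos = [x for x, t in zip(X, targets) if t == 1]
    cx = sum(p[0] for p in pos) / len(pos)
    cy = sum(p[1] for p in pos) / len(pos)
    return lambda x: -((x[0] - cx) ** 2 + (x[1] - cy) ** 2)
```

In practice each `train_binary` would fit an RBF-kernel SVM on the label-vs-rest targets; the wrapper logic is unchanged.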

Using GA based Input Selection Method for Artificial Neural Network Modeling Application to Bankruptcy Prediction (유전자 알고리즘을 활용한 인공신경망 모형 최적입력변수의 선정 : 부도예측 모형을 중심으로)

  • 홍승현;신경식
    • Proceedings of the Korea Intelligent Information Systems Society Conference
    • /
    • 1999.10a
    • /
    • pp.365-373
    • /
    • 1999
  • Recently, numerous studies have demonstrated that artificial intelligence techniques such as neural networks can be an alternative methodology for classification problems to which traditional statistical methods have long been applied. In building a neural network model, the selection of independent and dependent variables should be approached with great care and treated as part of the model construction process. Irrespective of the efficiency of a learning procedure in terms of convergence, generalization, and stability, the ultimate performance of the estimator depends on the relevance of the selected input variables and the quality of the data used. Approaches developed in statistics, such as correlation analysis and stepwise selection, are often very useful, but they may not be optimal for developing neural network models. In this paper, we propose a genetic algorithm approach to find an optimal or near-optimal set of input variables for neural network modeling. The approach is demonstrated through applications to bankruptcy prediction modeling. Our experimental results show that it increases the overall classification accuracy rate significantly.
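A GA input-selection loop of the kind described here typically encodes each candidate set of input variables as a binary chromosome and uses model quality as the fitness. The sketch below is a minimal, generic version with illustrative operator choices (truncation selection, one-point crossover, point mutation); `evaluate` stands in for training a network on the selected inputs and returning validation accuracy.

```python
import random

def fitness(mask, evaluate):
    """Wrapper fitness: model quality on the selected inputs minus a
    small penalty per variable, to favour compact input sets."""
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0
    return evaluate(selected) - 0.01 * len(selected)

def ga_select(evaluate, n_vars=8, pop_size=20, generations=40, seed=0):
    """Evolve a binary chromosome where bit i == 1 keeps input variable i."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, evaluate), reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection (elitist)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)      # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:              # point mutation
                i = rng.randrange(n_vars)
                child[i] ^= 1
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, evaluate))
```

With a toy fitness in which only two of eight variables are informative, the loop reliably recovers both.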


An Integrated Method for Application-level Internet Traffic Classification

  • Choi, Mi-Jung;Park, Jun-Sang;Kim, Myung-Sup
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.8 no.3
    • /
    • pp.838-856
    • /
    • 2014
  • Enhanced network speeds and the appearance of various applications have recently resulted in a rapid increase in Internet users and explosive growth of network traffic. Under these circumstances, Internet users expect reliable, Quality of Service (QoS)-guaranteed services. To provide reliable network services, network managers need to apply control measures such as dropping or blocking particular traffic types. To manage a traffic type, it is necessary to rapidly measure and correctly analyze Internet traffic, and to classify network traffic by application. Such classification results provide the basic information for ensuring service-specific QoS. Several traffic classification methodologies have been introduced; however, no single method achieves optimal performance in terms of accuracy, completeness, and applicability in a real network environment. In this paper, we propose a method to classify Internet traffic as the first step toward providing stable network services. We integrate the existing methodologies to compensate for their weaknesses and to improve the overall accuracy and completeness of classification, prioritizing methodologies that complement each other within our integrated classification system.
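One common shape for such an integrated, prioritized system is a cascade: classifiers are consulted in priority order and the first confident answer wins. The sketch below illustrates only that control flow; the two toy classifiers (`by_port`, `by_payload`) and their rules are hypothetical, not the paper's components.

```python
def integrated_classify(flow, classifiers):
    """Run the prioritized classifiers in order; the first one that
    returns a label wins, and later ones cover what earlier ones miss."""
    for classify in classifiers:
        label = classify(flow)
        if label is not None:
            return label
    return "unknown"

def by_port(flow):
    """Hypothetical port-based classifier (highest priority)."""
    return {80: "http", 443: "https"}.get(flow["port"])

def by_payload(flow):
    """Hypothetical payload-signature classifier (fallback)."""
    return "bittorrent" if b"BitTorrent" in flow["payload"] else None
```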

Classification of Seabed Physiognomy Based on Side Scan Sonar Images

  • Sun, Ning;Shim, Tae-Bo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.3E
    • /
    • pp.104-110
    • /
    • 2007
  • As exploration of the seabed extends ever further, automated recognition and classification of sonar images become increasingly important. However, most methods ignore directional information and its effect on the image textures produced. To deal with this problem, we apply 2D Gabor filters to extract features from sonar images. The filters are designed with constrained parameters to reduce complexity and improve computational efficiency. At each orientation, the optimal Gabor filter parameters are selected with the help of bandwidth parameters based on the Fisher criterion. This method overcomes some disadvantages of traditional approaches to extracting texture features and improves the recognition rate effectively.
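An oriented Gabor filter bank of this general kind can be sketched as follows: a real-valued Gabor kernel per orientation, with the mean squared filter response as a texture feature. The parameter values (`sigma`, `gamma`, the wavelength) are illustrative defaults, not the constrained parameters the paper derives.

```python
import math

def gabor_kernel(size, theta, wavelength, sigma=2.0, gamma=0.5):
    """Real part of a 2D Gabor kernel oriented at angle theta."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            g = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(g * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def filter_energy(image, kernel):
    """Mean squared response of the image under the kernel (valid convolution)."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    total, count = 0.0, 0
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            r = sum(kernel[u][v] * image[i + u][j + v]
                    for u in range(k) for v in range(k))
            total += r * r
            count += 1
    return total / count
```

A texture whose stripes match the carrier orientation responds far more strongly than the same texture at the orthogonal orientation, which is what makes the per-orientation energies usable as directional features.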

Reinforcement Post-Processing and Feedback Algorithm for Optimal Combination in Bottom-Up Hierarchical Classification (상향식 계층분류의 최적화 된 병합을 위한 후처리분석과 피드백 알고리즘)

  • Choi, Yun-Jeong;Park, Seung-Soo
    • The KIPS Transactions:PartB
    • /
    • v.17B no.2
    • /
    • pp.139-148
    • /
    • 2010
  • This paper presents a reinforcement post-processing method and a feedback algorithm that improve the category assignment step in classification. We focus in particular on complex documents that are generally considered hard to classify. The basic factors in a traditional classification system are the training methodology, the classification models, and the features of the documents. Documents containing shared features and multiple meanings must be mined and analyzed more deeply than generally formatted data. To address such documents, our previous studies proposed a method to expand the classification scheme using automatically detected decision boundaries. Here we focus on the assignment method, in which a document is simply assigned to the top-ranked category. We propose a post-processing method and a feedback algorithm that analyze the relevance of the ranked list. In experiments, we applied the post-processing method and a one-time feedback algorithm to complex documents. The results show that the system improves accuracy and flexibility without changing the classification algorithm itself.
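One simple way to post-process a ranked list with a one-time feedback pass is to accept the top category only when it clearly beats the runner-up, and otherwise re-score the ambiguous candidates. This sketch is an assumed interpretation of that idea, not the paper's algorithm; the 0.15 margin is an illustrative value.

```python
def assign_with_feedback(scores, reclassify=None, margin=0.15):
    """Assign a document to the top-ranked category only when it clearly
    beats the runner-up; otherwise run one feedback pass that re-scores
    the ambiguous top candidates."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (top, s1), (second, s2) = ranked[0], ranked[1]
    if s1 - s2 >= margin or reclassify is None:
        return top
    rescored = reclassify([top, second])       # one-time feedback pass
    return max(rescored, key=rescored.get)
```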

Breast Mass Classification using the Fundamental Deep Learning Approach: To build the optimal model applying various methods that influence the performance of CNN

  • Lee, Jin;Choi, Kwang Jong;Kim, Seong Jung;Oh, Ji Eun;Yoon, Woong Bae;Kim, Kwang Gi
    • Journal of Multimedia Information System
    • /
    • v.3 no.3
    • /
    • pp.97-102
    • /
    • 2016
  • Deep learning enables machines to perceive patterns and can potentially outperform humans in the medical field; it can save a great deal of time and reduce human error by detecting patterns in medical images without hand-crafted features. The main goal of this paper is to build an optimal model for breast mass classification by applying various methods that influence the performance of a Convolutional Neural Network (CNN). Google's software library TensorFlow was used to build the CNN, and the mammogram dataset used in this study was obtained from 340 breast cancer cases. The best classification performance we achieved was an accuracy of 0.887, sensitivity of 0.903, and specificity of 0.869 for normal tissue versus malignant mass classification, using augmented data, more convolutional filters, and the Adam optimizer. A limitation of this method, however, is that it considered only malignant masses, which are relatively easier to classify than benign masses. Further studies are therefore required to properly classify any given data for medical use.
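The data augmentation mentioned above is one of the performance levers for a CNN on a small medical dataset. A minimal sketch of simple geometric augmentations, operating on an image as a nested list (the specific transforms the paper used are not stated in the abstract, so these are assumed standard choices):

```python
def horizontal_flip(image):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in image]

def vertical_flip(image):
    """Mirror the rows top-to-bottom."""
    return image[::-1]

def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def augment(image):
    """Simple augmentations that multiply a mammogram patch four-fold."""
    return [image, horizontal_flip(image), vertical_flip(image), rotate90(image)]
```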