• 제목/요약/키워드: Classification Algorithms

검색결과 1,173건 처리시간 0.048초

A Sparse Target Matrix Generation Based Unsupervised Feature Learning Algorithm for Image Classification

  • Zhao, Dan;Guo, Baolong;Yan, Yunyi
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제12권6호
    • /
    • pp.2806-2825
    • /
    • 2018
  • Unsupervised learning has shown good performance on image, video and audio classification tasks, and much progress has been made so far. It studies how systems can learn to represent particular input patterns in a way that reflects the statistical structure of the overall collection of input patterns. Many promising deep learning systems are commonly trained by the greedy layerwise unsupervised learning manner. The performance of these deep learning architectures benefits from the unsupervised learning ability to disentangling the abstractions and picking out the useful features. However, the existing unsupervised learning algorithms are often difficult to train partly because of the requirement of extensive hyperparameters. The tuning of these hyperparameters is a laborious task that requires expert knowledge, rules of thumb or extensive search. In this paper, we propose a simple and effective unsupervised feature learning algorithm for image classification, which exploits an explicit optimizing way for population and lifetime sparsity. Firstly, a sparse target matrix is built by the competitive rules. Then, the sparse features are optimized by means of minimizing the Euclidean norm ($L_2$) error between the sparse target and the competitive layer outputs. Finally, a classifier is trained using the obtained sparse features. Experimental results show that the proposed method achieves good performance for image classification, and provides discriminative features that generalize well.

의미적 연관태그와 이미지 내용정보를 이용한 웹 이미지 분류 (Web Image Classification using Semantically Related Tags and Image Content)

  • 조수선
    • 인터넷정보학회논문지
    • /
    • 제11권3호
    • /
    • pp.15-24
    • /
    • 2010
  • 본 논문에서는 대용량 온라인 이미지 공유 사이트를 적용 도메인으로 하여 이미지 검색의 만족도를 높이고자 태그의 의미적 연관성과 이미지 자체의 내용 정보를 결합하는 이미지 분류 방법을 제안한다. 이미지 검색 및 분류 알고리즘이 플리커와 같은 대용량 이미지 공유 사이트에서 활용될 수 있으려면 실제 웹상의 태깅된 이미지를 대상으로 한 적용이 가능해야 한다. 제안된 알고리즘은 'bag of visual word'기반의 이미지 내용으로 웹 이미지를 분류하기 위한 것으로서, 의미적 연관태그를 이용해 일차 검색된 이미지들을 훈련 데이터로 사용하여 카테고리 모델을 훈련하고, PLSA를 적용하여 평가 이미지들을 분류하는 것이다. 제안된 방법으로 플리커의 웹 이미지들을 대상으로 실험한 결과, 태그 정보를 이용한 기존의 방법에 비해 우수한 검색 정확도 및 재현율을 확인할 수 있었다.

디자인 패턴 구조를 이용한 클러스터링에 관한 연구 (A Study on Clustering Algorithm Using Design Pattern Structure)

  • 한정수;김귀정
    • 한국콘텐츠학회논문지
    • /
    • 제2권1호
    • /
    • pp.68-76
    • /
    • 2002
  • 클러스터링은 부품 분류의 대표적인 방법인데, 클래스나 모듈의 응집도와 결합도를 이용한 기존의 클러스터링 방법은 클래스간의 관계에 중점을 둔 디자인 패턴을 기존의 클러스터링 방법을 이용하는 것은 효과적일 수 있다. 본 논문에서는 디자인 패턴을 분류하기 위해 패턴 구조의 특성을 가지고 분류하였다. 그리고 클러스터링에 의한 분류는 패싯 분류에 의한 방법보다 높은 정확도를 보여주었다. 따라서 자동화된 분류방법인 클러스터링 알고리즘을 사용하여 디자인 패턴을 분류하는 것이 효과적이라 할 수 있다. 디자인 패턴의 분류는 검색 시 유사한 패턴들이 같은 카테고리에 저장이 되므로 유사 패턴을 비교하여 사용할 수 있으며, 패턴 클러스터링에 의해 분류되고, 패턴의 링크정보를 이용하여 저장하므로 저장소를 효율적으로 관리할 수 있다.

  • PDF

텍스트 마이닝을 이용한 XML 문서 분류 기술 (Classification Techniques for XML Document Using Text Mining)

  • 김천식;홍유식
    • 한국컴퓨터정보학회논문지
    • /
    • 제11권2호
    • /
    • pp.15-23
    • /
    • 2006
  • 인터넷에는 많은 문서가 있고 지금도 새로운 문서가 만들어지고 있다. 따라서 인터넷에 존재하는 문서를 의미 있게 분류하는 것은 향후 문서의 관리 및 질의처리에서 중요한 문제이다. 하지만 지금까지 대부분은 키워드에 기초한 문서 분류방법을 사용하고 있다. 이 방법은 문서를 효율적으로 분류하지 못했다. 또한 의미를 포함한 문서의 분류를 하지 못한다. 사람이 문서를 꼼꼼하게 읽어서 문서를 분류하는 방법이 최선이지만, 시간적인 면이나 효율성에 문제가 있다. 따라서 본 논문에서는 신경망 알고리즘과 C4.5 알고리즘을 이용하여 문서를 분류하고자 한다. 실험 데이터로 XML로 만들어진 이력서 데이터를 사용하여 실험하였다. 실험결과 문서 분류에 가능성을 보였다. 또한, 다양한 문서 분류 응용에 적용하여 좋은 결과를 얻을 것으로 기대한다.

  • PDF

Conditional Mutual Information-Based Feature Selection Analyzing for Synergy and Redundancy

  • Cheng, Hongrong;Qin, Zhiguang;Feng, Chaosheng;Wang, Yong;Li, Fagen
    • ETRI Journal
    • /
    • 제33권2호
    • /
    • pp.210-218
    • /
    • 2011
  • Battiti's mutual information feature selector (MIFS) and its variant algorithms are used for many classification applications. Since they ignore feature synergy, MIFS and its variants may cause a big bias when features are combined to cooperate together. Besides, MIFS and its variants estimate feature redundancy regardless of the corresponding classification task. In this paper, we propose an automated greedy feature selection algorithm called conditional mutual information-based feature selection (CMIFS). Based on the link between interaction information and conditional mutual information, CMIFS takes account of both redundancy and synergy interactions of features and identifies discriminative features. In addition, CMIFS combines feature redundancy evaluation with classification tasks. It can decrease the probability of mistaking important features as redundant features in searching process. The experimental results show that CMIFS can achieve higher best-classification-accuracy than MIFS and its variants, with the same or less (nearly 50%) number of features.

Application of Bayesian Statistical Analysis to Multisource Data Integration

  • Hong, Sa-Hyun;Moon, Wooil-M.
    • 대한원격탐사학회:학술대회논문집
    • /
    • 대한원격탐사학회 2002년도 Proceedings of International Symposium on Remote Sensing
    • /
    • pp.394-399
    • /
    • 2002
  • In this paper, Multisource data classification methods based on Bayesian formula are considered. For this decision fusion scheme, the individual data sources are handled separately by statistical classification algorithms and then Bayesian fusion method is applied to integrate from the available data sources. This method includes the combination of each expert decisions where the weights of the individual experts represent the reliability of the sources. The reliability measure used in the statistical approach is common to all pixels in previous work. In this experiment, the weight factors have been assigned to have different value for all pixels in order to improve the integrated classification accuracies. Although most implementations of Bayesian classification approaches assume fixed a priori probabilities, we have used adaptive a priori probabilities by iteratively calculating the local a priori probabilities so as to maximize the posteriori probabilities. The effectiveness of the proposed method is at first demonstrated on simulations with artificial and evaluated in terms of real-world data sets. As a result, we have shown that Bayesian statistical fusion scheme performs well on multispectral data classification.

  • PDF

A Implementation of Optimal Multiple Classification System using Data Mining for Genome Analysis

  • Jeong, Yu-Jeong;Choi, Gwang-Mi
    • 한국컴퓨터정보학회논문지
    • /
    • 제23권12호
    • /
    • pp.43-48
    • /
    • 2018
  • In this paper, more efficient classification result could be obtained by applying the combination of the Hidden Markov Model and SVM Model to HMSV algorithm gene expression data which simulated the stochastic flow of gene data and clustering it. In this paper, we verified the HMSV algorithm that combines independently learned algorithms. To prove that this paper is superior to other papers, we tested the sensitivity and specificity of the most commonly used classification criteria. As a result, the K-means is 71% and the SOM is 68%. The proposed HMSV algorithm is 85%. These results are stable and high. It can be seen that this is better classified than using a general classification algorithm. The algorithm proposed in this paper is a stochastic modeling of the generation process of the characteristics included in the signal, and a good recognition rate can be obtained with a small amount of calculation, so it will be useful to study the relationship with diseases by showing fast and effective performance improvement with an algorithm that clusters nodes by simulating the stochastic flow of Gene Data through data mining of BigData.

A Predictive Model to identify possible affected Bipolar disorder students using Naive Baye's, Random Forest and SVM machine learning techniques of data mining and Building a Sequential Deep Learning Model using Keras

  • Peerbasha, S.;Surputheen, M. Mohamed
    • International Journal of Computer Science & Network Security
    • /
    • 제21권5호
    • /
    • pp.267-274
    • /
    • 2021
  • Medical care practices include gathering a wide range of student data that are with manic episodes and depression which would assist the specialist with diagnosing a health condition of the students correctly. In this way, the instructors of the specific students will also identify those students and take care of them well. The data which we collected from the students could be straightforward indications seen by them. The artificial intelligence has been utilized with Naive Baye's classification, Random forest classification algorithm, SVM algorithm to characterize the datasets which we gathered to check whether the student is influenced by Bipolar illness or not. Performance analysis of the disease data for the algorithms used is calculated and compared. Also, a sequential deep learning model is builded using Keras. The consequences of the simulations show the efficacy of the grouping techniques on a dataset, just as the nature and complexity of the dataset utilized.

ConvXGB: A new deep learning model for classification problems based on CNN and XGBoost

  • Thongsuwan, Setthanun;Jaiyen, Saichon;Padcharoen, Anantachai;Agarwal, Praveen
    • Nuclear Engineering and Technology
    • /
    • 제53권2호
    • /
    • pp.522-531
    • /
    • 2021
  • We describe a new deep learning model - Convolutional eXtreme Gradient Boosting (ConvXGB) for classification problems based on convolutional neural nets and Chen et al.'s XGBoost. As well as image data, ConvXGB also supports the general classification problems, with a data preprocessing module. ConvXGB consists of several stacked convolutional layers to learn the features of the input and is able to learn features automatically, followed by XGBoost in the last layer for predicting the class labels. The ConvXGB model is simplified by reducing the number of parameters under appropriate conditions, since it is not necessary re-adjust the weight values in a back propagation cycle. Experiments on several data sets from UCL Repository, including images and general data sets, showed that our model handled the classification problems, for all the tested data sets, slightly better than CNN and XGBoost alone and was sometimes significantly better.

Vehicle Classification by Road Lane Detection and Model Fitting Using a Surveillance Camera

  • Shin, Wook-Sun;Song, Doo-Heon;Lee, Chang-Hun
    • Journal of Information Processing Systems
    • /
    • 제2권1호
    • /
    • pp.52-57
    • /
    • 2006
  • One of the important functions of an Intelligent Transportation System (ITS) is to classify vehicle types using a vision system. We propose a method using machine-learning algorithms for this classification problem with 3-D object model fitting. It is also necessary to detect road lanes from a fixed traffic surveillance camera in preparation for model fitting. We apply a background mask and line analysis algorithm based on statistical measures to Hough Transform (HT) in order to remove noise and false positive road lanes. The results show that this method is quite efficient in terms of quality.