• Title/Summary/Keyword: Supervised LDA

PCA-based Feature Extraction using Class Information (클래스 정보를 이용한 PCA 기반의 특징 추출)

  • Park, Myoung-Soo;Na, Jin-Hee;Choi, Jin-Young
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.15 no.4
    • /
    • pp.492-497
    • /
    • 2005
  • Feature extraction is important for classifying high-dimensional data such as image data. Representative feature extraction methods include PCA, ICA, LDA, and MLP. These algorithms fall into two groups: unsupervised algorithms such as PCA and ICA, and supervised algorithms such as LDA and MLP. Of the two groups, supervised algorithms are better suited to extracting features for classification because they use the class information of the input data. In this paper we propose a new feature extraction algorithm, PCA-FX, which uses class information with PCA to extract features for classification. We test our algorithm on the Yale face database and compare its performance with that of other algorithms.

A Semi-supervised Dimension Reduction Method Using Ensemble Approach (앙상블 접근법을 이용한 반감독 차원 감소 방법)

  • Park, Cheong-Hee
    • The KIPS Transactions:PartD
    • /
    • v.19D no.2
    • /
    • pp.147-150
    • /
    • 2012
  • LDA is a supervised dimension reduction method that finds projection directions maximizing the separability between classes, but its performance degrades severely when the amount of labeled data is small. Recently, semi-supervised dimension reduction methods have been proposed that exploit abundant unlabeled data to overcome the shortage of labeled data. However, the matrix computations typically used in statistical dimension reduction methods make it difficult to use a large amount of unlabeled data, and too much information from unlabeled data may not be helpful enough to justify the increase in processing time. To solve these problems, we propose an ensemble approach for semi-supervised dimension reduction. Extensive experimental results in text classification demonstrate the effectiveness of the proposed method.

Generative probabilistic model with Dirichlet prior distribution for similarity analysis of research topic

  • Milyahilu, John;Kim, Jong Nam
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.4
    • /
    • pp.595-602
    • /
    • 2020
  • We propose a generative probabilistic model with a Dirichlet prior distribution for topic modeling and text similarity analysis. It assigns a topic and calculates the text correlation between documents within a corpus. It also provides the posterior probabilities assigned to each topic of a document based on the prior distribution in the corpus. We then present a Gibbs sampling algorithm for inference about the posterior distribution and compute the text correlation among 50 abstracts from papers published by IEEE. We also conduct supervised learning to set a benchmark that justifies the performance of LDA (Latent Dirichlet Allocation). The experiments show that the accuracy of topic assignment to a given document is 76% for LDA. The supervised learning results show an accuracy of 61%, a precision of 93%, and an F1-score of 96%. A discussion of the experimental results provides a thorough justification based on probabilities, distributions, evaluation metrics, and correlation coefficients with respect to topic assignment.
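
As an illustration of the evaluation metrics named in this abstract, the following minimal sketch computes accuracy, precision, recall, and F1-score for a binary labeling. The function name `classification_metrics` and the toy labels are assumptions made for illustration; they are not taken from the paper, which does not state how its metrics were averaged.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels, used only to exercise the function.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because F1 is a harmonic mean, it always lies between precision and recall, which is a quick sanity check when reading reported scores.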

A Study on Feature Projection Methods for a Real-Time EMG Pattern Recognition (실시간 근전도 패턴인식을 위한 특징투영 기법에 관한 연구)

  • Chu, Jun-Uk;Kim, Shin-Ki;Mun, Mu-Seong;Moon, In-Hyuk
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.12 no.9
    • /
    • pp.935-944
    • /
    • 2006
  • EMG pattern recognition is essential for the control of a multifunction myoelectric hand. The main goal of this study is to develop an efficient feature projection method for EMG pattern recognition. To this end, we propose a linear supervised feature projection that utilizes linear discriminant analysis (LDA). We first perform a wavelet packet transform (WPT) to extract the feature vector from four-channel EMG signals. For dimensionality reduction and clustering of the WPT features, LDA incorporates class information into the learning procedure and finds a linear matrix that maximizes the class separability of the projected features. Finally, a multilayer perceptron classifies the LDA-reduced features into nine hand motions. To evaluate the performance of LDA for the WPT features, we compare LDA with three other feature projection methods. From a visualization and a quantitative comparison, we show that LDA achieves better class separability and that the LDA-projected features improve the classification accuracy with a short processing time. We implemented a real-time pattern recognition system for a multifunction myoelectric hand. In experiments, we show that the proposed method achieves 97.2% recognition accuracy and that all processes, including the generation of control commands for the myoelectric hand, are completed within 97 msec. These results confirm that our method is applicable to real-time EMG pattern recognition for myoelectric hand control.
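
The idea of a linear projection that maximizes class separability can be sketched with the classical two-class Fisher discriminant, whose direction is w = Sw^{-1}(m_a - m_b), with Sw the within-class scatter matrix. This is a simplified illustration only: the abstract describes a nine-class problem on WPT features, whereas the sketch below assumes two classes, two-dimensional features, and made-up sample points.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter(points, m):
    """2x2 scatter matrix: sum of (x - m)(x - m)^T over the points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class_a, class_b):
    """Two-class Fisher direction w = Sw^{-1} (m_a - m_b) in 2-D."""
    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def project(points, w):
    """Project each 2-D point onto the direction w (a scalar per point)."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Hypothetical feature vectors for two classes.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.5)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0), (7.0, 6.0)]

w = fisher_direction(class_a, class_b)
proj_a = project(class_a, w)
proj_b = project(class_b, w)
print(w, proj_a, proj_b)
```

For this toy data the two classes do not overlap after projection, so a single threshold on the one-dimensional projection separates them.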

A Comparison Study of Classification Algorithms in Data Mining

  • Lee, Seung-Joo;Jun, Sung-Rae
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.8 no.1
    • /
    • pp.1-5
    • /
    • 2008
  • Generally, the analytical tools of data mining fall into two learning types: supervised and unsupervised learning algorithms. Classification and prediction are the main analysis tools for supervised learning. In this paper, we perform a comparison study of classification algorithms in data mining. We compare popular classification algorithms: LDA, QDA, kernel method, K-nearest neighbor, naive Bayesian, SVM, and CART. We use almost all of the classification data sets in the UCI machine learning repository for our experiments. Based on our results, we are able to select proper algorithms for given classification data sets.

Semi-Supervised Answer Type Classification For Question-Answering System (질의 응답 시스템을 위한 반교사 기반의 정답 유형 분류)

  • Park, Seonyeong;Lee, Donghyeon;Kim, Yonghee;Ryu, Seonghan;Lee, Gary Geunbae
    • Annual Conference on Human and Language Technology
    • /
    • 2013.10a
    • /
    • pp.45-49
    • /
    • 2013
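  • Previous work has used pattern matching or supervised learning to classify answer types in question-answering systems. Pattern matching requires patterns to be built manually through query analysis. Supervised learning requires all of the training data to be tagged with answer types, which demands a great deal of manual tagging of user queries. A large corpus of user queries can be obtained from the web, but this data is not tagged with answer types. The amount of manual answer type tagging therefore grows in proportion to the volume of user queries. To address the heavy manual effort that both approaches require for answer type classification, this paper proposes answer type classification based on semi-supervised learning, which needs only partially tagged training data. This minimizes the labor needed for answer type classification and makes it possible to build an efficient question-answering system from large-scale data.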

Real-Time Face Recognition Based on Subspace and LVQ Classifier (부분공간과 LVQ 분류기에 기반한 실시간 얼굴 인식)

  • Kwon, Oh-Ryun;Min, Kyong-Pil;Chun, Jun-Chul
    • Journal of Internet Computing and Services
    • /
    • v.8 no.3
    • /
    • pp.19-32
    • /
    • 2007
  • This paper presents a new face recognition method based on an LVQ neural net for constructing a real-time face recognition system. Previous studies that combined PCA and LDA with a neural net usually need much time to train the neural net. The supervised LVQ neural net needs much less training time and can maximize the separability between classes. The proposed method transforms the input face image by PCA and LDA sequentially into low-dimensional feature vectors and recognizes the face through the LVQ neural net. To make the system robust to external light variation, light compensation is performed on the detected face by the max-min normalization method as preprocessing. PCA and LDA transformations are applied to the normalized face image to produce low-level feature vectors of the image. To determine the initial centers of LVQ and speed up the convergence of the LVQ neural net, the K-Means clustering algorithm is adopted. Subsequently, the class representative vectors are produced by LVQ2 training using the initial center vectors. Face recognition is achieved using the Euclidean distance between the class center vectors and the feature vector of the input image. The experiments show that the proposed method achieves a better recognition ratio on still images from the ORL database and on sequential images than conventional PCA or a hybrid method with PCA and LDA.
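
Two of the steps named in this abstract, max-min normalization and classification by Euclidean distance to class center vectors, can be sketched as follows. This is a minimal illustration under stated assumptions: normalization is taken to be the usual rescaling to [0, 1], and the function names, center vectors, and labels are hypothetical. PCA, LDA, and LVQ training are not shown.

```python
import math


def min_max_normalize(values):
    """Rescale values linearly to [0, 1]; a constant input maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def nearest_center(feature, centers):
    """Return the label whose center vector is closest in Euclidean distance."""
    return min(centers, key=lambda label: math.dist(feature, centers[label]))


# Hypothetical pixel intensities and class center vectors.
pixels = [50, 100, 150, 200]
normalized = min_max_normalize(pixels)

centers = {"person_a": [0.1, 0.2], "person_b": [0.8, 0.9]}
label = nearest_center([0.7, 0.8], centers)
print(normalized, label)
```

`math.dist` requires Python 3.8 or later. In a full pipeline the center vectors would come from the trained classifier rather than being written by hand.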

User Needs-Based Technology Opportunities in Heterogeneous Fields Using Opinion Mining and Patent Analysis (오피니언 마이닝 및 특허분석을 통한 사용자 니즈기반 이종영역 기술기회 탐색)

  • Jang, Hyejin;Roh, Taeyeoun;Yoon, Byungun
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.43 no.1
    • /
    • pp.39-48
    • /
    • 2017
  • In a digital economy, users actively express their needs in many ways. Many researchers therefore use opinion mining to analyze what users need and whether they are satisfied. Researchers have also begun to look for technology opportunities in heterogeneous technology fields. However, they have not connected users' opinions to the technology development process, focusing only on natural language processing, marketing, or manufacturing. In addition, work on heterogeneous technology fields has focused on fusion technology. This study therefore suggests a novel approach that is based on sentimental value and can be applied to exploring technology opportunities in heterogeneous fields. Sentimental value is calculated from users' opinions through sLDA. The heterogeneous technology opportunity is explored by patent analysis. This research contributes a hybrid methodology that combines patents and users' opinions. In addition, it can improve managerial efficiency by providing base data for decision making.

Research Topics in Industrial Engineering 2001~2015 (국내 산업공학 연구 주제 2001~2015)

  • Jeong, Bokwon;Lee, Hakyeon
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.42 no.6
    • /
    • pp.421-431
    • /
    • 2016
  • Over the last four decades, industrial engineering (IE) research in Korea has continued to evolve and expand in response to social needs. This paper aims to identify research topics in IE research and explore their dynamic changes over time. The topic modeling approach, which automatically discovers topics that pervade a large and unstructured collection of documents, is adopted to identify research topics in domestic IE research. A total of 1,242 articles published from 2001 to 2015 in two IE journals issued by the Korean Institute of Industrial Engineers were collected, and their English abstracts were analyzed. Applying the Latent Dirichlet Allocation model uncovered 50 topics of domestic IE research. The top 10 most popular topics are revealed, and topic trends are explored by examining the dynamic changes over time. Four topics (technology management, financial engineering, data mining (supervised learning), and efficiency analysis) are selected as hot topics, while several traditional topics related to manufacturing are revealed as cold topics. The findings are expected to provide fruitful implications for IE researchers.

Detection of Clavibacter michiganensis subsp. michiganensis Assisted by Micro-Raman Spectroscopy under Laboratory Conditions

  • Perez, Moises Roberto Vallejo;Contreras, Hugo Ricardo Navarro;Herrera, Jesus A. Sosa;Avila, Jose Pablo Lara;Tobias, Hugo Magdaleno Ramirez;Martinez, Fernando Diaz-Barriga;Ramirez, Rogelio Flores;Vazquez, Angel Gabriel Rodriguez
    • The Plant Pathology Journal
    • /
    • v.34 no.5
    • /
    • pp.381-392
    • /
    • 2018
  • Clavibacter michiganensis subsp. michiganensis (Cmm) is a quarantine-worthy pest in México. The implementation and validation of new technologies is necessary to reduce the time for bacterial detection under laboratory conditions, and Raman spectroscopy is an ambitious technology that has all of the features needed to characterize and identify bacteria. Under controlled conditions, a contagion process was induced with Cmm and the disease epidemiology was monitored. The performance of the micro-Raman spectroscopy technique (532 nm wavelength laser) in assisting Cmm detection through its characteristic Raman spectral fingerprint was evaluated. Our experiment was conducted with tomato plants in a completely randomized block experimental design (13 plants × 4 rows). The Cmm infection was confirmed by 16S rDNA, and plants showed symptoms from 48 to 72 h after inoculation; the incidence and severity in the plant population varied over time and kept an aggregated spatial pattern. The contagion process reached 79% just 24 days after the epidemic was induced. Micro-Raman spectroscopy proved its speed, efficiency, and usefulness as a non-destructive method for the preliminary detection of Cmm. Carotenoid-specific bands at 1146 and 1510 $cm^{-1}$ were the distinguishable markers. Chemometric analyses showed that the best performance was obtained with PCA-LDA supervised classification algorithms applied to the Raman spectrum data, with 100% on all classifier metrics (sensitivity, specificity, accuracy, negative and positive predictive value), which allowed us to differentiate Cmm from other endophytic bacteria (Bacillus and Pantoea). The unsupervised KMeans algorithm also showed good performance (100, 96, 98, 91, and 100%, respectively).