• 제목/요약/키워드: Supervised Data

검색결과 651건 처리시간 0.029초

머신러닝 알고리즘 분석 및 비교를 통한 Big-5 기반 성격 분석 연구 (A Study on Big-5 based Personality Analysis through Analysis and Comparison of Machine Learning Algorithm)

  • 김용준
    • 한국인터넷방송통신학회논문지
    • /
    • 제19권4호
    • /
    • pp.169-174
    • /
    • 2019
  • 본 연구에서는 설문지를 이용한 데이터 수집과 데이터 마이닝에서 클러스터링 기법으로 군집하여 지도학습을 이용하여 유사성을 판단하고, 성격들의 상관 관계의 적합성을 분석하기 위해 특징 추출 알고리즘들과 지도학습을 이용하는 것을 목표로 진행한다. 연구 수행은 설문조사를 진행 후 그 설문조사를 토대로 모인 데이터들을 정제하고, 오픈 소스 기반의 데이터 마이닝 도구인 WEKA의 클러스터링 기법들을 통해 데이터 세트를 분류하고 지도학습을 이용하여 유사성을 판단한다. 그리고 특징 추출 알고리즘들과 지도학습을 이용하여 성격에 대해 적합한 결과가 나오는지에 대한 적합성을 판단한다. 그 결과 유사성 판단에 가장 정확도 높게 도움을 주는 것은 EM 클러스터링으로 3개의 분류하고 Naïve Bayes 지도학습을 시킨 것이 가장 높은 유사성 분류 결과를 도출하였고, 적합성을 판단하는데 도움이 되도록 특징추출과 지도학습을 수행하였을 때, Big-5 각 성격마다 문항에 추가되고 삭제되는 것에 따라 정확도가 변하는 모습을 찾게 되었고, 각 성격 마다 차이에 대한 분석을 완료하였다.

준지도 학습에서 꼭지점 중요도를 고려한 레이블 추론 (A Label Inference Algorithm Considering Vertex Importance in Semi-Supervised Learning)

  • 오병화;양지훈;이현진
    • 정보과학회 논문지
    • /
    • 제42권12호
    • /
    • pp.1561-1567
    • /
    • 2015
  • 준지도 학습은 기계 학습의 한 분야로서, 레이블된 데이터와 레이블되지 않은 데이터 모두를 사용하여 모델을 학습함으로써 지도 학습에 비해 예측 정확도를 높일 수 있다. 최근 각광받고 있는 그래프 기반 준지도 학습은 입력 데이터를 그래프의 형태로 변환하는 그래프 구축 단계와 이를 사용하여 레이블되지 않은 데이터의 레이블을 예측하는 레이블 추론 단계로 나뉜다. 이 추론은 준지도 학습에서의 평활도 가정을 기본으로 한다. 본 연구에서는 추가로 각 꼭지점 중요도를 결합함으로써 개선된 레이블 추론 알고리즘을 제안한다. 이와 함께 알고리즘의 수렴성을 증명하고, 또한 실험을 통해 알고리즘의 우수성을 검증하였다.

선별적인 임계값 선택을 이용한 준지도 학습의 SAR 분류 기술 (Semi-Supervised SAR Image Classification via Adaptive Threshold Selection)

  • 도재준;유민정;이재석;문효이;김선옥
    • 한국군사과학기술학회지
    • /
    • 제27권3호
    • /
    • pp.319-328
    • /
    • 2024
  • Semi-supervised learning is a good way to train a classification model using a small number of labeled and large number of unlabeled data. We applied semi-supervised learning to a synthetic aperture radar(SAR) image classification model with a limited number of datasets that are difficult to create. To address the previous difficulties, semi-supervised learning uses a model trained with a small amount of labeled data to generate and learn pseudo labels. Besides, a lot of number of papers use a single fixed threshold to create pseudo labels. In this paper, we present a semi-supervised synthetic aperture radar(SAR) image classification method that applies different thresholds for each class instead of all classes sharing a fixed threshold to improve SAR classification performance with a small number of labeled datasets.

점진적 능동준지도 학습 기반 고효율 적응적 얼굴 표정 인식 (High Efficiency Adaptive Facial Expression Recognition based on Incremental Active Semi-Supervised Learning)

  • 김진우;이필규
    • 한국인터넷방송통신학회논문지
    • /
    • 제17권2호
    • /
    • pp.165-171
    • /
    • 2017
  • 사람의 얼굴 표정을 실제 환경에서 인식하는 데에는 여러 가지 난이한 점이 존재한다. 그래서 학습에 사용된 데이터베이스와 실험 데이터가 여러 가지 조건이 비슷할 때에만 그 성능이 높게 나온다. 이러한 문제점을 해결하려면 수많은 얼굴 표정 데이터가 필요하다. 본 논문에서는 능동준지도 학습을 통해 다양한 조건의 얼굴 표정 데이터를 쉽게 모으고 보다 빠르게 성능을 확보할 수 있는 방법을 제안한다. 제안하는 알고리즘은 딥러닝 네트워크와 능동 학습 (Active Learning)을 통해 초기 모델을 학습하고, 이후로는 준지도 학습(Semi-Supervised Learning)을 통해 라벨이 없는 추가 데이터를 확보하며, 성능이 확보될 때까지 이러한 과정을 반복한다. 위와 같은 능동준지도 학습(Active Semi-Supervised Learning)을 통해서 보다 적은 노동력으로 다양한 환경에 적합한 데이터를 확보하여 성능을 확보할 수 있다.

Semi-supervised learning using similarity and dissimilarity

  • Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • 제22권1호
    • /
    • pp.99-105
    • /
    • 2011
  • We propose a semi-supervised learning algorithm based on a form of regularization that incorporates similarity and dissimilarity penalty terms. Our approach uses a graph-based encoding of similarity and dissimilarity. We also present a model-selection method which employs cross-validation techniques to choose hyperparameters which affect the performance of the proposed method. Simulations using two types of dat sets demonstrate that the proposed method is promising.

Sentiment Orientation Using Deep Learning Sequential and Bidirectional Models

  • Alyamani, Hasan J.
    • International Journal of Computer Science & Network Security
    • /
    • 제21권11호
    • /
    • pp.23-30
    • /
    • 2021
  • Sentiment Analysis has become very important field of research because posting of reviews is becoming a trend. Supervised, unsupervised and semi supervised machine learning methods done lot of work to mine this data. Feature engineering is complex and technical part of machine learning. Deep learning is a new trend, where this laborious work can be done automatically. Many researchers have done many works on Deep learning Convolutional Neural Network (CNN) and Long Shor Term Memory (LSTM) Neural Network. These requires high processing speed and memory. Here author suggested two models simple & bidirectional deep leaning, which can work on text data with normal processing speed. At end both models are compared and found bidirectional model is best, because simple model achieve 50% accuracy and bidirectional deep learning model achieve 99% accuracy on trained data while 78% accuracy on test data. But this is based on 10-epochs and 40-batch size. This accuracy can also be increased by making different attempts on epochs and batch size.

Machine Learning Techniques for Speech Recognition using the Magnitude

  • Krishnan, C. Gopala;Robinson, Y. Harold;Chilamkurti, Naveen
    • Journal of Multimedia Information System
    • /
    • 제7권1호
    • /
    • pp.33-40
    • /
    • 2020
  • Machine learning consists of supervised and unsupervised learning among which supervised learning is used for the speech recognition objectives. Supervised learning is the Data mining task of inferring a function from labeled training data. Speech recognition is the current trend that has gained focus over the decades. Most automation technologies use speech and speech recognition for various perspectives. This paper demonstrates an overview of major technological standpoint and gratitude of the elementary development of speech recognition and provides impression method has been developed in every stage of speech recognition using supervised learning. The project will use DNN to recognize speeches using magnitudes with large datasets.

A Hybrid Selection Method of Helpful Unlabeled Data Applicable for Semi-Supervised Learning Algorithm

  • Le, Thanh-Binh;Kim, Sang-Woon
    • IEIE Transactions on Smart Processing and Computing
    • /
    • 제3권4호
    • /
    • pp.234-239
    • /
    • 2014
  • This paper presents an empirical study on selecting a small amount of useful unlabeled data to improve the classification accuracy of semi-supervised learning algorithms. In particular, a hybrid method of unifying the simply recycled selection method and the incrementally-reinforced selection method was considered and evaluated empirically. The experimental results, which were obtained from well-known benchmark data sets using semi-supervised support vector machines, demonstrated that the hybrid method works better than the traditional ones in terms of the classification accuracy.

Input Variable Importance in Supervised Learning Models

  • Huh, Myung-Hoe;Lee, Yong Goo
    • Communications for Statistical Applications and Methods
    • /
    • 제10권1호
    • /
    • pp.239-246
    • /
    • 2003
  • Statisticians, or data miners, are often requested to assess the importances of input variables in the given supervised learning model. For the purpose, one may rely on separate ad hoc measures depending on modeling types, such as linear regressions, the neural networks or trees. Consequently, the conceptual consistency in input variable importance measures is lacking, so that the measures cannot be directly used in comparing different types of models, which is often done in data mining processes, In this short communication, we propose a unified approach to the importance measurement of input variables. Our method uses sensitivity analysis which begins by perturbing the values of input variables and monitors the output change. Research scope is limited to the models for continuous output, although it is not difficult to extend the method to supervised learning models for categorical outcomes.

지능형 교육 시스템의 학습자 분류를 위한 Variational Auto-Encoder 기반 준지도학습 기법 (Variational Auto-Encoder Based Semi-supervised Learning Scheme for Learner Classification in Intelligent Tutoring System)

  • 정승원;손민재;황인준
    • 한국멀티미디어학회논문지
    • /
    • 제22권11호
    • /
    • pp.1251-1258
    • /
    • 2019
  • Intelligent tutoring system enables users to effectively learn by utilizing various artificial intelligence techniques. For instance, it can recommend a proper curriculum or learning method to individual users based on their learning history. To do this effectively, user's characteristics need to be analyzed and classified based on various aspects such as interest, learning ability, and personality. Even though data labeled by the characteristics are required for more accurate classification, it is not easy to acquire enough amount of labeled data due to the labeling cost. On the other hand, unlabeled data should not need labeling process to make a large number of unlabeled data be collected and utilized. In this paper, we propose a semi-supervised learning method based on feedback variational auto-encoder(FVAE), which uses both labeled data and unlabeled data. FVAE is a variation of variational auto-encoder(VAE), where a multi-layer perceptron is added for giving feedback. Using unlabeled data, we train FVAE and fetch the encoder of FVAE. And then, we extract features from labeled data by using the encoder and train classifiers with the extracted features. In the experiments, we proved that FVAE-based semi-supervised learning was superior to VAE-based method in terms with accuracy and F1 score.