• Title/Abstract/Keywords: Unsupervised

Search results: 822 items (processing time: 0.023 s)

Severity-based Fault Prediction using Unsupervised Learning

  • 홍의석
    • 한국인터넷방송통신학회논문지 / Vol. 18, No. 3 / pp. 151-157 / 2018
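  • Most previous studies on software fault prediction have dealt with binary supervised classification models that determine whether an input module is faulty. Binary classification models, however, only judge whether an input module contains faults and ignore the complex characteristics of those faults, and supervised models require training data sets that most development groups do not have. To address these two problems, this paper proposes a severity-based ternary classification model that uses unsupervised algorithms. In evaluation experiments, the proposed model showed prediction performance comparable to that of supervised models.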

Unsupervised Incremental Learning of Associative Cubes with Orthogonal Kernels

  • Kang, Hoon;Ha, Joonsoo;Shin, Jangbeom;Lee, Hong Gi;Wang, Yang
    • 한국지능시스템학회논문지 / Vol. 25, No. 1 / pp. 97-104 / 2015
  • An 'associative cube', a class of auto-associative memories, is revisited here, in which training data and hidden orthogonal basis functions, such as wavelet packets or Fourier kernels, are combined in the weight cube. This weight cube has hidden units in its depth, represented by a three-dimensional cubic structure. We develop an unsupervised incremental learning mechanism based upon the adaptive least squares method. Training data are mapped into orthogonal basis vectors in a least-squares sense by updating the weights to minimize an energy function. A prescribed orthogonal kernel is thereby incrementally assigned to each incoming datum. Next, we show how a decoding procedure finds the closest one with a competitive network in the hidden layer. When noisy test data are applied to an associative cube, the nearest of the original training data is restored in an optimal sense. The simulation results confirm the robustness of associative cubes even when test data are heavily distorted by various types of noise.

Unsupervised Semantic Role Labeling for Korean Adverbial Case

  • 김병수;이용훈;이종혁
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 34, No. 2 / pp. 112-122 / 2007
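  • Statistical semantic role labeling from a corpus requires a corpus tagged with semantic roles. For Korean, however, a large corpus tagged with semantic roles is hard to obtain, and building one by hand takes considerable time and effort. This paper proposes a method that determines semantic roles from a corpus without semantic role tags by applying the self-training algorithm, a form of unsupervised learning. To this end, a training corpus was built automatically from the case frame information in the Sejong electronic verb dictionary (세종 용언 전자사전), and a probabilistic model was applied to learn incrementally. As a result, the method achieved an average precision of 83.00% on four adverbial case particles.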

Bagging deep convolutional autoencoders trained with a mixture of real data and GAN-generated data

  • Hu, Cong;Wu, Xiao-Jun;Shu, Zhen-Qiu
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 13, No. 11 / pp. 5427-5445 / 2019
  • While deep neural networks have achieved remarkable performance in representation learning, supervised deep models such as convolutional neural networks usually require a huge amount of labeled training data. In this paper, we propose a new representation learning method, namely generative adversarial networks (GAN) based bagging deep convolutional autoencoders (GAN-BDCAE), which can map data to diverse hierarchical representations in an unsupervised fashion. Boosting the size of the training data, training deep models, and aggregating diverse learning machines are three principal avenues toward increasing the representation learning capability of neural networks. We focus on combining those three techniques. To this end, we adopt GAN for realistic unlabeled sample generation and bagging deep convolutional autoencoders (BDCAE) for robust feature learning. The proposed method improves the discriminative ability of the learned feature embedding for solving subsequent pattern recognition problems. We evaluate our approach on three standard benchmarks and demonstrate the superiority of the proposed method compared to traditional unsupervised learning methods.

Feature Selection via Embedded Learning Based on Tangent Space Alignment for Microarray Data

  • Ye, Xiucai;Sakurai, Tetsuya
    • Journal of Computing Science and Engineering / Vol. 11, No. 4 / pp. 121-129 / 2017
  • Feature selection has been widely established as an efficient technique for microarray data analysis. Feature selection aims to search for the most important feature/gene subset of a given dataset according to its relevance to the current target. Unsupervised feature selection is considered to be challenging due to the lack of label information. In this paper, we propose a novel method for unsupervised feature selection, which incorporates embedded learning and $l_{2,1}$-norm sparse regression into a framework to select genes in microarray data analysis. Local tangent space alignment is applied during embedded learning to preserve the local data structure. The $l_{2,1}$-norm sparse regression acts as a constraint to aid in learning the gene weights correlatively, by which the proposed method optimizes for selecting the informative genes which better capture the interesting natural classes of samples. We provide an effective algorithm to solve the optimization problem in our method. Finally, to validate the efficacy of the proposed method, we evaluate the proposed method on real microarray gene expression datasets. The experimental results demonstrate that the proposed method obtains quite promising performance.
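
For reference, the $l_{2,1}$-norm named in this abstract is conventionally defined as follows for a weight matrix $W \in \mathbb{R}^{d \times c}$ with rows $w_i$. The symbols $W$, $d$, and $c$ are illustrative and are not taken from the paper. Because the norm sums the Euclidean norms of the rows, penalizing it tends to drive entire rows (here, the weights of individual genes) to zero, which is why it is commonly used for feature selection.

```latex
\|W\|_{2,1} \;=\; \sum_{i=1}^{d} \sqrt{\sum_{j=1}^{c} W_{ij}^{2}} \;=\; \sum_{i=1}^{d} \|w_i\|_{2}
```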

Bathymetric mapping in Dong-Sha Atoll using SPOT data

  • Huang, Shih-Jen;Wen, Yao-Chung
    • 대한원격탐사학회:학술대회논문집 / 대한원격탐사학회 2006, Proceedings of ISRS 2006 PORSEC Volume II / pp. 525-528 / 2006
  • Remote sensing data can be used to calculate water depth, especially in clear and shallow water areas. In this study, SPOT data were used for bathymetric mapping in Dong-Sha atoll, located in the northern South China Sea. The in situ sea depth was also collected by echo sounder. A global positioning system was employed to locate the sampling points for sea depth accurately. An empirical model relating measured sea depth to band digital count was determined using least squares regression analysis. Both non-classification and unsupervised classification were used in this study. The results show that the standard error is less than 0.9 m for non-classification. In addition, a relative error within 10% of the measured water depth is achieved for more than 85% of the in situ data points. The 10% relative error is achieved for more than 97%, 69%, and 51% of the data points in classes 4, 5, and 6, respectively, if supervised classification is applied. We also find that unsupervised classification estimates water depth more accurately, with standard errors of less than 0.63, 0.93, and 0.68 m for classes 4, 5, and 6, respectively.
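
The least squares step described in this abstract can be illustrated with a minimal sketch. It assumes a simple linear model, depth = a * digital_count + b, and a residual standard error with n - 2 degrees of freedom; the abstract does not state the functional form or the error definition the authors used, so both are assumptions. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def fit_least_squares(xs, ys):
    """Ordinary least squares fit of y = a * x + b; returns (a, b)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def standard_error(xs, ys, a, b):
    """Residual standard error of the fit (assumed definition, n - 2 dof)."""
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than two observations")
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))


# Made-up illustrative values: band digital counts and echo-sounder depths (m).
digital_counts = [40.0, 35.0, 30.0, 25.0]
depths_m = [2.0, 4.1, 5.9, 8.0]

slope, intercept = fit_least_squares(digital_counts, depths_m)
print(slope, intercept, standard_error(digital_counts, depths_m, slope, intercept))
```

In a per-class variant, the same fit would be repeated separately on the points that fall in each class, which is one plausible reading of the per-class standard errors reported above.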

Development of Brain-Style Intelligent Information Processing Algorithm Through the Merge of Supervised and Unsupervised Learning: Generation of Exemplar Patterns for Training

  • 오상훈
    • 전자공학회논문지CI / Vol. 41, No. 6 / pp. 61-67 / 2004
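  • For cases where a sufficient number of training patterns cannot be collected because of time or cost constraints or limits on what can be collected, we propose an algorithm that generates new training patterns using supervised and unsupervised learning models that mimic the human brain. The unsupervised learning stage uses independent component analysis to analyze the characteristics of the patterns and then generate new ones, and the supervised learning stage uses a multilayer perceptron model to verify the generated patterns. We analyzed this form of pattern generation statistically. By varying the number of handwritten-digit training patterns and measuring the misrecognition rate on test patterns, we confirmed that pattern generation improves performance.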

Keyword Extraction in Korean Using Unsupervised Learning Method

  • 신성윤;이양원
    • 한국정보통신학회논문지 / Vol. 14, No. 6 / pp. 1403-1408 / 2010
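  • Korean information retrieval uses nouns as the index terms or keywords that represent a document, and noun and keyword extraction is the task of finding all nouns present in a document. This paper presents a method for extracting keywords using a pre-built dictionary. The method shortens execution time by reducing unnecessary operations, and it can extract nouns even from large documents without greatly affecting accuracy. This paper presents a noun extraction method that uses the occurrence characteristics of nouns and a keyword extraction method based on unsupervised learning.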

A Study on the Unsupervised Classification of Hyperion and ETM+ Data Using Spectral Angle and Unit Vector

  • Kim, Dae-Sung;Kim, Yong-Il;Yu, Ki-Yun
    • Korean Journal of Geomatics / Vol. 5, No. 1 / pp. 27-34 / 2005
  • Unsupervised classification is an important area of research in image processing because supervised classification has disadvantages such as long training time, high cost, and low objectivity of the training information. This paper focuses on unsupervised classification, which can extract ground object information by using the minimum 'Spectral Angle Distance' instead of the 'Spectral Euclidean Distance' in the clustering process. Unlike previous studies, our algorithm uses the unit vector, not the spectral distance, to compute the cluster mean, and the Single-Pass algorithm automatically determines the seed points. Atmospheric correction was applied to the Hyperion data for more accurate results, and the results were analyzed. We applied the algorithm to the Hyperion and ETM+ data and compared the results with K-Means and the former USAM algorithm. USAM classified the water and dark forest areas well and gave more accurate results than K-Means, so we believe that the 'Spectral Angle' can be one of the most accurate classifiers for not only multispectral images but also hyperspectral images. The unit vector can also be an efficient technique for characterizing remote sensing data.
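
The two ingredients named in this abstract, the spectral angle between two spectra and a cluster mean computed from unit vectors, can be sketched as follows. This is a minimal illustration under assumptions, not the authors' USAM implementation: the function names are hypothetical, and taking the cluster center as the normalized sum of the members' unit vectors is one plausible reading of "uses the unit vector to compute the cluster mean".

```python
import math


def unit_vector(v):
    """Scale a spectrum to unit length so only its direction matters."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("zero vector has no direction")
    return [x / norm for x in v]


def spectral_angle(a, b):
    """Angle in radians between spectra a and b (0 means same direction)."""
    ua, ub = unit_vector(a), unit_vector(b)
    cosine = sum(x * y for x, y in zip(ua, ub))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cosine)))


def unit_vector_mean(spectra):
    """Cluster center as the normalized sum of member unit vectors (assumed)."""
    total = [0.0] * len(spectra[0])
    for spectrum in spectra:
        for i, x in enumerate(unit_vector(spectrum)):
            total[i] += x
    return unit_vector(total)


def assign(pixel, centers):
    """Index of the center with the minimum spectral angle to the pixel."""
    return min(range(len(centers)), key=lambda k: spectral_angle(pixel, centers[k]))
```

Because the spectral angle ignores vector length, a pixel and a uniformly brighter or darker copy of it are assigned to the same cluster, which is the usual motivation for preferring it to Euclidean distance when illumination varies.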

Analysis on the Distribution of RF Threats Using Unsupervised Learning Techniques

  • 김철표;노상욱;박소령
    • 한국군사과학기술학회지 / Vol. 19, No. 3 / pp. 346-355 / 2016
  • In this paper, we propose a method to analyze the clusters of RF threats emitting electrical signals based on signal variables collected in integrated electronic warfare environments. We first analyze the signal variables collected by an electronic warfare receiver and construct a model based on the variables that show the properties of threats. To visualize the distribution of RF threats and identify them in reverse, we use the k-means clustering algorithm and the self-organizing map (SOM) algorithm, both of which are unsupervised learning techniques. Through the resulting model compiled by the k-means clustering and SOM algorithms, the RF threats can be classified into one of the distributions of RF threats. In an experiment, we measure the accuracy of the classification results obtained with the algorithms and verify that the resulting model can be used to visually recognize the distribution of RF threats.
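
As a rough illustration of the k-means step mentioned in this abstract, the sketch below implements the standard Lloyd iteration in plain Python. It is not the authors' model: the function name is hypothetical, the centroids are initialized from the first k points purely to keep the example deterministic, and the two-dimensional sample points are made-up, unitless stand-ins for signal variables, which the abstract does not enumerate. The SOM part is not covered.

```python
import math


def kmeans(points, k, iterations=100):
    """Lloyd's k-means; returns (centroids, labels)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Simplifying assumption: start from the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Made-up, unitless 2-D feature vectors standing in for signal variables.
samples = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 11.0]]
centers, assignments = kmeans(samples, k=2)
print(centers, assignments)
```

A new observation would then be assigned to the cluster whose centroid is nearest, which corresponds to classifying a threat into one of the learned groups.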