• 제목/요약/키워드: semi-supervised

검색결과 172건 처리시간 0.021초

Semi-supervised regression based on support vector machine

  • Seok, Kyungha
    • Journal of the Korean Data and Information Science Society
    • /
    • 제25권2호
    • /
    • pp.447-454
    • /
    • 2014
  • In many practical machine learning and data mining applications, unlabeled training examples are readily available but labeled ones are fairly expensive to obtain. Therefore semi-supervised learning algorithms have attracted much attentions. However, previous research mainly focuses on classication problems. In this paper, a semi-supervised regression method based on support vector regression (SVR) formulation that is proposed. The estimator is easily obtained via the dual formulation of the optimization problem. The experimental results with simulated and real data suggest superior performance of the our proposed method compared with standard SVR.

Improve the Performance of Semi-Supervised Side-channel Analysis Using HWFilter Method

  • Hong Zhang;Lang Li;Di Li
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제18권3호
    • /
    • pp.738-754
    • /
    • 2024
  • Side-channel analysis (SCA) is a cryptanalytic technique that exploits physical leakages, such as power consumption or electromagnetic emanations, from cryptographic devices to extract secret keys used in cryptographic algorithms. Recent studies have shown that training SCA models with semi-supervised learning can effectively overcome the problem of few labeled power traces. However, the process of training SCA models using semi-supervised learning generates many pseudo-labels. The performance of the SCA model can be reduced by some of these pseudo-labels. To solve this issue, we propose the HWFilter method to improve semi-supervised SCA. This method uses a Hamming Weight Pseudo-label Filter (HWPF) to filter the pseudo-labels generated by the semi-supervised SCA model, which enhances the model's performance. Furthermore, we introduce a normal distribution method for constructing the HWPF. In the normal distribution method, the Hamming weights (HWs) of power traces can be obtained from the normal distribution of power points. These HWs are filtered and combined into a HWPF. The HWFilter was tested using the ASCADv1 database and the AES_HD dataset. The experimental results demonstrate that the HWFilter method can significantly enhance the performance of semi-supervised SCA models. In the ASCADv1 database, the model with HWFilter requires only 33 power traces to recover the key. In the AES_HD dataset, the model with HWFilter outperforms the current best semi-supervised SCA model by 12%.

제조 공정 결함 탐지를 위한 MixMatch 기반 준지도학습 성능 분석 (Performance Analysis of MixMatch-Based Semi-Supervised Learning for Defect Detection in Manufacturing Processes)

  • 김예준;정예은;김용수
    • 산업경영시스템학회지
    • /
    • 제46권4호
    • /
    • pp.312-320
    • /
    • 2023
  • Recently, there has been an increasing attempt to replace defect detection inspections in the manufacturing industry using deep learning techniques. However, obtaining substantial high-quality labeled data to enhance the performance of deep learning models entails economic and temporal constraints. As a solution for this problem, semi-supervised learning, using a limited amount of labeled data, has been gaining traction. This study assesses the effectiveness of semi-supervised learning in the defect detection process of manufacturing using the MixMatch algorithm. The MixMatch algorithm incorporates three dominant paradigms in the semi-supervised field: Consistency regularization, Entropy minimization, and Generic regularization. The performance of semi-supervised learning based on the MixMatch algorithm was compared with that of supervised learning using defect image data from the metal casting process. For the experiments, the ratio of labeled data was adjusted to 5%, 10%, 25%, and 50% of the total data. At a labeled data ratio of 5%, semi-supervised learning achieved a classification accuracy of 90.19%, outperforming supervised learning by approximately 22%p. At a 10% ratio, it surpassed supervised learning by around 8%p, achieving a 92.89% accuracy. These results demonstrate that semi-supervised learning can achieve significant outcomes even with a very limited amount of labeled data, suggesting its invaluable application in real-world research and industrial settings where labeled data is limited.

The use of support vector machines in semi-supervised classification

  • Bae, Hyunjoo;Kim, Hyungwoo;Shin, Seung Jun
    • Communications for Statistical Applications and Methods
    • /
    • 제29권2호
    • /
    • pp.193-202
    • /
    • 2022
  • Semi-supervised learning has gained significant attention in recent applications. In this article, we provide a selective overview of popular semi-supervised methods and then propose a simple but effective algorithm for semi-supervised classification using support vector machines (SVM), one of the most popular binary classifiers in a machine learning community. The idea is simple as follows. First, we apply the dimension reduction to the unlabeled observations and cluster them to assign labels on the reduced space. SVM is then employed to the combined set of labeled and unlabeled observations to construct a classification rule. The use of SVM enables us to extend it to the nonlinear counterpart via kernel trick. Our numerical experiments under various scenarios demonstrate that the proposed method is promising in semi-supervised classification.

스마트폰 로봇의 위치 인식을 위한 준 지도식 학습 기법 (Semi-supervised Learning for the Positioning of a Smartphone-based Robot)

  • 유재현;김현진
    • 제어로봇시스템학회논문지
    • /
    • 제21권6호
    • /
    • pp.565-570
    • /
    • 2015
  • Supervised machine learning has become popular in discovering context descriptions from sensor data. However, collecting a large amount of labeled training data in order to guarantee good performance requires a great deal of expense and time. For this reason, semi-supervised learning has recently been developed due to its superior performance despite using only a small number of labeled data. In the existing semi-supervised learning algorithms, unlabeled data are used to build a graph Laplacian in order to represent an intrinsic data geometry. In this paper, we represent the unlabeled data as the spatial-temporal dataset by considering smoothly moving objects over time and space. The developed algorithm is evaluated for position estimation of a smartphone-based robot. In comparison with other state-of-art semi-supervised learning, our algorithm performs more accurate location estimates.

Semi-Supervised Learning Using Kernel Estimation

  • Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • 제18권3호
    • /
    • pp.629-636
    • /
    • 2007
  • A kernel type semi-supervised estimate is proposed. The proposed estimate is based on the penalized least squares loss and the principle of Gaussian Random Fields Model. As a result, we can estimate the label of new unlabeled data without re-computation of the algorithm that is different from the existing transductive semi-supervised learning. Also our estimate is viewed as a general form of Gaussian Random Fields Model. We give experimental evidence suggesting that our estimate is able to use unlabeled data effectively and yields good classification.

  • PDF

Study on semi-supervised local constant regression estimation

  • Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • 제23권3호
    • /
    • pp.579-585
    • /
    • 2012
  • Many different semi-supervised learning algorithms have been proposed for use wit unlabeled data. However, most of them focus on classification problems. In this paper we propose a semi-supervised regression algorithm called the semi-supervised local constant estimator (SSLCE), based on the local constant estimator (LCE), and reveal the asymptotic properties of SSLCE. We also show that the SSLCE has a faster convergence rate than that of the LCE when a well chosen weighting factor is employed. Our experiment with synthetic data shows that the SSLCE can improve performance with unlabeled data, and we recommend its use with the proper size of unlabeled data.

준지도 커널능형회귀모형에 관한 연구 (A study on semi-supervised kernel ridge regression estimation)

  • 석경하
    • Journal of the Korean Data and Information Science Society
    • /
    • 제24권2호
    • /
    • pp.341-353
    • /
    • 2013
  • 데이터마이닝과 기계학습의 응용분야에서는 라벨 없는 자료를 이용하는 연구가 많이 진행되고 있다. 이러한 연구는 분류문제에 집중되었다가 최근에 회귀분석문제로 관심이 모아지고 있다. 본 연구에서는 커널능형회귀모형 형태의 준지도 회귀분석 방법을 제시한다. 제안된 방법은 기존의 전환적 방법과는 달리 라벨 없는 자료의 라벨을 추정하는 과정을 필요로 하지 않기 때문에 선택해야 할 모수의 수도 적고, 계산과정도 단순할 뿐 아니라 일반화에 강점이 있다. 모의실험과 실제 자료 분석을 통해 제안된 방법이 라벨 없는 자료를 잘 활용하여 라벨 있는 자료만 이용하는 방법보다 더 우수한 추정을 하는 것을 볼 수 있었다.

선별적인 임계값 선택을 이용한 준지도 학습의 SAR 분류 기술 (Semi-Supervised SAR Image Classification via Adaptive Threshold Selection)

  • 도재준;유민정;이재석;문효이;김선옥
    • 한국군사과학기술학회지
    • /
    • 제27권3호
    • /
    • pp.319-328
    • /
    • 2024
  • Semi-supervised learning is a good way to train a classification model using a small number of labeled and large number of unlabeled data. We applied semi-supervised learning to a synthetic aperture radar(SAR) image classification model with a limited number of datasets that are difficult to create. To address the previous difficulties, semi-supervised learning uses a model trained with a small amount of labeled data to generate and learn pseudo labels. Besides, a lot of number of papers use a single fixed threshold to create pseudo labels. In this paper, we present a semi-supervised synthetic aperture radar(SAR) image classification method that applies different thresholds for each class instead of all classes sharing a fixed threshold to improve SAR classification performance with a small number of labeled datasets.

준지도학습 기반 반도체 공정 이상 상태 감지 및 분류 (Semi-Supervised Learning for Fault Detection and Classification of Plasma Etch Equipment)

  • 이용호;최정은;홍상진
    • 반도체디스플레이기술학회지
    • /
    • 제19권4호
    • /
    • pp.121-125
    • /
    • 2020
  • With miniaturization of semiconductor, the manufacturing process become more complex, and undetected small changes in the state of the equipment have unexpectedly changed the process results. Fault detection classification (FDC) system that conducts more active data analysis is feasible to achieve more precise manufacturing process control with advanced machine learning method. However, applying machine learning, especially in supervised learning criteria, requires an arduous data labeling process for the construction of machine learning data. In this paper, we propose a semi-supervised learning to minimize the data labeling work for the data preprocessing. We employed equipment status variable identification (SVID) data and optical emission spectroscopy data (OES) in silicon etch with SF6/O2/Ar gas mixture, and the result shows as high as 95.2% of labeling accuracy with the suggested semi-supervised learning algorithm.