• Title/Summary/Keyword: unsupervised model

Search Result 239, Processing Time 0.022 seconds

Unsupervised Speaker Adaptation Based on Sufficient HMM Statistics (SUFFICIENT HMM 통계치에 기반한 UNSUPERVISED 화자 적응)

  • Ko Bong-Ok;Kim Chong-Kyo
    • Proceedings of the KSPS conference
    • /
    • 2003.05a
    • /
    • pp.127-130
    • /
    • 2003
  • This paper describes an efficient method for unsupervised speaker adaptation. This method is based on selecting a subset of speakers who are acoustically close to a test speaker, and calculating adapted model parameters according to the previously stored sufficient HMM statistics of the selected speakers' data. In this method, only a few unsupervised test speaker's data are required for the adaptation. Also, by using the sufficient HMM statistics of the selected speakers' data, a quick adaptation can be done. Compared with a pre-clustering method, the proposed method can obtain a more optimal speaker cluster because the clustering result is determined according to test speaker's data on-line. Experiment results show that the proposed method attains better improvement than MLLR from the speaker independent model. Moreover the proposed method utilizes only one unsupervised sentence utterance, while MLLR usually utilizes more than ten supervised sentence utterances.

  • PDF

Unsupervised Learning Model for Fault Prediction Using Representative Clustering Algorithms (대표적인 클러스터링 알고리즘을 사용한 비감독형 결함 예측 모델)

  • Hong, Euyseok;Park, Mikyeong
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.3 no.2
    • /
    • pp.57-64
    • /
    • 2014
  • Most previous studies of software fault prediction model which determines the fault-proneness of input modules have focused on supervised learning model using training data set. However, Unsupervised learning model is needed in case supervised learning model cannot be applied: either past training data set is not present or even though there exists data set, current project type is changed. Building an unsupervised learning model is extremely difficult that is why only a few studies exist. In this paper, we build unsupervised models using representative clustering algorithms, EM and DBSCAN, that have not been used in prior studies and compare these models with the previous model using K-means algorithm. The results of our study show that the EM model performs slightly better than the K-means model in terms of error rate and these two models significantly outperform the DBSCAN model.

Knowledge Distillation for Unsupervised Depth Estimation (비지도학습 기반의 뎁스 추정을 위한 지식 증류 기법)

  • Song, Jimin;Lee, Sang Jun
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.17 no.4
    • /
    • pp.209-215
    • /
    • 2022
  • This paper proposes a novel approach for training an unsupervised depth estimation algorithm. The objective of unsupervised depth estimation is to estimate pixel-wise distances from camera without external supervision. While most previous works focus on model architectures, loss functions, and masking methods for considering dynamic objects, this paper focuses on the training framework to effectively use depth cue. The main loss function of unsupervised depth estimation algorithms is known as the photometric error. In this paper, we claim that direct depth cue is more effective than the photometric error. To obtain the direct depth cue, we adopt the technique of knowledge distillation which is a teacher-student learning framework. We train a teacher network based on a previous unsupervised method, and its depth predictions are utilized as pseudo labels. The pseudo labels are employed to train a student network. In experiments, our proposed algorithm shows a comparable performance with the state-of-the-art algorithm, and we demonstrate that our teacher-student framework is effective in the problem of unsupervised depth estimation.

Improved Algorithm for Fully-automated Neural Spike Sorting based on Projection Pursuit and Gaussian Mixture Model

  • Kim, Kyung-Hwan
    • International Journal of Control, Automation, and Systems
    • /
    • v.4 no.6
    • /
    • pp.705-713
    • /
    • 2006
  • For the analysis of multiunit extracellular neural signals as multiple spike trains, neural spike sorting is essential. Existing algorithms for the spike sorting have been unsatisfactory when the signal-to-noise ratio(SNR) is low, especially for implementation of fully-automated systems. We present a novel method that shows satisfactory performance even under low SNR, and compare its performance with a recent method based on principal component analysis(PCA) and fuzzy c-means(FCM) clustering algorithm. Our system consists of a spike detector that shows high performance under low SNR, a feature extractor that utilizes projection pursuit based on negentropy maximization, and an unsupervised classifier based on Gaussian mixture model. It is shown that the proposed feature extractor gives better performance compared to the PCA, and the proposed combination of spike detector, feature extraction, and unsupervised classification yields much better performance than the PCA-FCM, in that the realization of fully-automated unsupervised spike sorting becomes more feasible.

Unsupervised Learning-Based Pipe Leak Detection using Deep Auto-Encoder

  • Yeo, Doyeob;Bae, Ji-Hoon;Lee, Jae-Cheol
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.9
    • /
    • pp.21-27
    • /
    • 2019
  • In this paper, we propose a deep auto-encoder-based pipe leak detection (PLD) technique from time-series acoustic data collected by microphone sensor nodes. The key idea of the proposed technique is to learn representative features of the leak-free state using leak-free time-series acoustic data and the deep auto-encoder. The proposed technique can be used to create a PLD model that detects leaks in the pipeline in an unsupervised learning manner. This means that we only use leak-free data without labeling while training the deep auto-encoder. In addition, when compared to the previous supervised learning-based PLD method that uses image features, this technique does not require complex preprocessing of time-series acoustic data owing to the unsupervised feature extraction scheme. The experimental results show that the proposed PLD method using the deep auto-encoder can provide reliable PLD accuracy even considering unsupervised learning-based feature extraction.

Detecting Anomalies in Time-Series Data using Unsupervised Learning and Analysis on Infrequent Signatures

  • Bian, Xingchao
    • Journal of IKEEE
    • /
    • v.24 no.4
    • /
    • pp.1011-1016
    • /
    • 2020
  • We propose a framework called Stacked Gated Recurrent Unit - Infrequent Residual Analysis (SG-IRA) that detects anomalies in time-series data that can be trained on streams of raw sensor data without any pre-labeled dataset. To enable such unsupervised learning, SG-IRA includes an estimation model that uses a stacked Gated Recurrent Unit (GRU) structure and an analysis method that detects anomalies based on the difference between the estimated value and the actual measurement (residual). SG-IRA's residual analysis method dynamically adapts the detection threshold from the population using frequency analysis, unlike the baseline model that relies on a constant threshold. In this paper, SG-IRA is evaluated using the industrial control systems (ICS) datasets. SG-IRA improves the detection performance (F1 score) by 5.9% compared to the baseline model.

Anomaly Detection in Sensor Data

  • Kim, Jong-Min;Baik, Jaiwook
    • Journal of Applied Reliability
    • /
    • v.18 no.1
    • /
    • pp.20-32
    • /
    • 2018
  • Purpose: The purpose of this study is to set up an anomaly detection criteria for sensor data coming from a motorcycle. Methods: Five sensor values for accelerator pedal, engine rpm, transmission rpm, gear and speed are obtained every 0.02 second from a motorcycle. Exploratory data analysis is used to find any pattern in the data. Traditional process control methods such as X control chart and time series models are fitted to find any anomaly behavior in the data. Finally unsupervised learning algorithm such as k-means clustering is used to find any anomaly spot in the sensor data. Results: According to exploratory data analysis, the distribution of accelerator pedal sensor values is very much skewed to the left. The motorcycle seemed to have been driven in a city at speed less than 45 kilometers per hour. Traditional process control charts such as X control chart fail due to severe autocorrelation in each sensor data. However, ARIMA model found three abnormal points where they are beyond 2 sigma limits in the control chart. We applied a copula based Markov chain to perform statistical process control for correlated observations. Copula based Markov model found anomaly behavior in the similar places as ARIMA model. In an unsupervised learning algorithm, large sensor values get subdivided into two, three, and four disjoint regions. So extreme sensor values are the ones that need to be tracked down for any sign of anomaly behavior in the sensor values. Conclusion: Exploratory data analysis is useful to find any pattern in the sensor data. Process control chart using ARIMA and Joe's copula based Markov model also give warnings near similar places in the data. Unsupervised learning algorithm shows us that the extreme sensor values are the ones that need to be tracked down for any sign of anomaly behavior.

Severity-based Fault Prediction using Unsupervised Learning (비감독형 학습 기법을 사용한 심각도 기반 결함 예측)

  • Hong, Euyseok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.18 no.3
    • /
    • pp.151-157
    • /
    • 2018
  • Most previous studies of software fault prediction have focused on supervised learning models for binary classification that determines whether an input module has faults or not. However, binary classification model determines only the presence or absence of faults in the module without considering the complex characteristics of the fault, and supervised model has the limitation that it requires a training data set that most development groups do not have. To solve these two problems, this paper proposes severity-based ternary classification model using unsupervised learning algorithms, and experimental results show that the proposed model has comparable performance to the supervised models.

A New Application of Unsupervised Learning to Nighttime Sea Fog Detection

  • Shin, Daegeun;Kim, Jae-Hwan
    • Asia-Pacific Journal of Atmospheric Sciences
    • /
    • v.54 no.4
    • /
    • pp.527-544
    • /
    • 2018
  • This paper presents a nighttime sea fog detection algorithm incorporating unsupervised learning technique. The algorithm is based on data sets that combine brightness temperatures from the $3.7{\mu}m$ and $10.8{\mu}m$ channels of the meteorological imager (MI) onboard the Communication, Ocean and Meteorological Satellite (COMS), with sea surface temperature from the Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA). Previous algorithms generally employed threshold values including the brightness temperature difference between the near infrared and infrared. The threshold values were previously determined from climatological analysis or model simulation. Although this method using predetermined thresholds is very simple and effective in detecting low cloud, it has difficulty in distinguishing fog from stratus because they share similar characteristics of particle size and altitude. In order to improve this, the unsupervised learning approach, which allows a more effective interpretation from the insufficient information, has been utilized. The unsupervised learning method employed in this paper is the expectation-maximization (EM) algorithm that is widely used in incomplete data problems. It identifies distinguishing features of the data by organizing and optimizing the data. This allows for the application of optimal threshold values for fog detection by considering the characteristics of a specific domain. The algorithm has been evaluated using the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) vertical profile products, which showed promising results within a local domain with probability of detection (POD) of 0.753 and critical success index (CSI) of 0.477, respectively.

Language Model Adaptation Based on Topic Probability of Latent Dirichlet Allocation

  • Jeon, Hyung-Bae;Lee, Soo-Young
    • ETRI Journal
    • /
    • v.38 no.3
    • /
    • pp.487-493
    • /
    • 2016
  • Two new methods are proposed for an unsupervised adaptation of a language model (LM) with a single sentence for automatic transcription tasks. At the training phase, training documents are clustered by a method known as Latent Dirichlet allocation (LDA), and then a domain-specific LM is trained for each cluster. At the test phase, an adapted LM is presented as a linear mixture of the now trained domain-specific LMs. Unlike previous adaptation methods, the proposed methods fully utilize a trained LDA model for the estimation of weight values, which are then to be assigned to the now trained domain-specific LMs; therefore, the clustering and weight-estimation algorithms of the trained LDA model are reliable. For the continuous speech recognition benchmark tests, the proposed methods outperform other unsupervised LM adaptation methods based on latent semantic analysis, non-negative matrix factorization, and LDA with n-gram counting.