• Title/Summary/Keyword: unsupervised model

Anomaly Detection In Real Power Plant Vibration Data by MSCRED Base Model Improved By Subset Sampling Validation (Subset 샘플링 검증 기법을 활용한 MSCRED 모델 기반 발전소 진동 데이터의 이상 진단)

  • Hong, Su-Woong;Kwon, Jang-Woo
    • Journal of Convergence for Information Technology / v.12 no.1 / pp.31-38 / 2022
  • This paper applies MSCRED (Multi-Scale Convolutional Recurrent Encoder-Decoder), an expert-independent, unsupervised neural network model for multivariate time series analysis. Because MSCRED is based on an autoencoder, its training data must not be contaminated; to overcome this limitation, the paper uses a training-data sampling technique called Subset Sampling Validation. Using labeled vibration data from power plant equipment, the classification performance of MSCRED is evaluated with the Anomaly Score in two cases: 1) abnormal data is mixed into the training data, and 2) the abnormal data from case 1 is removed from the training data. Through this, the paper presents an expert-independent anomaly diagnosis framework that is robust to erroneous data and offers a concise and accurate solution for multivariate time series data in various fields.

Why Should I Ban You! : X-FDS (Explainable FDS) Model Based on Online Game Payment Log (X-FDS : 게임 결제 로그 기반 XAI적용 이상 거래탐지 모델 연구)

  • Lee, Young Hun;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.1 / pp.25-38 / 2022
  • With the diversification of payment methods and games, related financial accidents are causing serious problems for users and game companies. Recently, game companies have introduced a Fraud Detection System (FDS) for game payment systems to prevent financial incidents. However, FDS is ineffective, as it requires constant changes to its detection patterns, and it cannot provide substantial evidence for its judgment results. In this paper, we analyze abnormal transactions in the payment log data of real game companies to generate related features. An Autoencoder, one of the unsupervised learning models, was used to build a model to detect abnormal transactions, which resulted in over 85% accuracy. Using X-FDS (Explainable FDS) with XAI-SHAP, we found that the variables that best explain the anomaly detection results were the transaction amount, the transaction medium, and the age of users. Based on X-FDS, we finally derived an improved detection model with an accuracy of 94% by fine-tuning the importance of features that adversely affect the proposed model.

Graph-Based Word Sense Disambiguation Using Iterative Approach (반복적 기법을 사용한 그래프 기반 단어 모호성 해소)

  • Kang, Sangwoo
    • The Journal of Korean Institute of Next Generation Computing / v.13 no.2 / pp.102-110 / 2017
  • Current word sense disambiguation techniques employ various machine learning-based methods. Various approaches have been proposed to address this problem, including the knowledge base approach. This approach defines the sense of an ambiguous word in accordance with knowledge base information with no training corpus. In unsupervised learning techniques that use a knowledge base approach, graph-based and similarity-based methods have been the main research areas. The graph-based method has the advantage of constructing a semantic graph that delineates all paths between different senses that an ambiguous word may have. However, unnecessary semantic paths may be introduced, thereby increasing the risk of errors. To solve this problem and construct a fine-grained graph, in this paper, we propose a model that iteratively constructs the graph while eliminating unnecessary nodes and edges, i.e., senses and semantic paths. The hybrid similarity estimation model was applied to estimate a more accurate sense in the constructed semantic graph. Because the proposed model uses BabelNet, a multilingual lexical knowledge base, the model is not limited to a specific language.

Hydrological Forecasting Based on Hybrid Neural Networks in a Small Watershed (중소하천유역에서 Hybrid Neural Networks에 의한 수문학적 예측)

  • Kim, Seong-Won;Lee, Sun-Tak;Jo, Jeong-Sik
    • Journal of Korea Water Resources Association / v.34 no.4 / pp.303-316 / 2001
  • In this study, a Radial Basis Function (RBF) Neural Networks Model, a kind of Hybrid Neural Network, was applied to hydrological forecasting in a small watershed. The RBF Neural Networks Model has four kinds of parameters and consists of unsupervised and supervised training patterns. Among the many kinds of Radial Basis Functions (RBFs), the Gaussian Kernel Function (GKF) was used. The K-Means clustering algorithm was applied to optimize the centers and widths, which are the parameters of the GKF. The parameters of the RBF Neural Networks Model, namely centers, widths, weights, and biases, were determined by its training procedures, and the validation procedures were then carried out with these parameters. The model was applied to the Wi-Stream basin, one of the IHP Representative basins in South Korea. Ten rainfall events were selected for training and validation. The results were compared with those of the Elman Neural Networks (ENN) Model, which is composed of One Step Secant BackPropagation (OSSBP) and Resilient BackPropagation (RBP) algorithms. The RBF Neural Networks Model shows better results than the ENN Model. It also took less time to train and can be easily used by hydrologists with little background knowledge of RBF Neural Networks.
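The unsupervised stage described in this abstract (K-Means choosing the centers and widths of the Gaussian kernels) can be sketched roughly as below. This is a minimal illustration and not the authors' implementation: the inputs are one-dimensional, the sample values are hypothetical, and taking each width as the within-cluster standard deviation is an assumption, since the abstract does not say how widths were derived. The supervised stage (weights and biases) is omitted.

```python
import math


def kmeans_1d(values, k, iters=20):
    """Plain Lloyd's algorithm on scalars; centers start evenly spaced over the data range."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers, clusters


def cluster_widths(centers, clusters, floor=1e-3):
    """Assumption: width = within-cluster standard deviation, floored to stay positive."""
    widths = []
    for center, members in zip(centers, clusters):
        if not members:
            widths.append(floor)
            continue
        var = sum((m - center) ** 2 for m in members) / len(members)
        widths.append(max(math.sqrt(var), floor))
    return widths


def gaussian_activations(x, centers, widths):
    """Hidden-layer outputs of Gaussian kernels for a scalar input x."""
    return [math.exp(-((x - c) ** 2) / (2 * w ** 2))
            for c, w in zip(centers, widths)]


# Hypothetical scalar inputs forming two well-separated groups.
samples = [0.1, 0.2, 0.15, 5.0, 5.1, 4.9]
centers, clusters = kmeans_1d(samples, k=2)
widths = cluster_widths(centers, clusters)
hidden = gaussian_activations(0.15, centers, widths)
```

In a full RBF network, the `hidden` activations would feed a linear output layer whose weights and biases are fitted in the supervised stage.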

A Design on Informal Big Data Topic Extraction System Based on Spark Framework (Spark 프레임워크 기반 비정형 빅데이터 토픽 추출 시스템 설계)

  • Park, Kiejin
    • KIPS Transactions on Software and Data Engineering / v.5 no.11 / pp.521-526 / 2016
  • As online informal text data are massive in volume and unstructured in nature, there are limitations in applying traditional relational data model technologies to data storage and analysis. Moreover, with massive, dynamically generated social data, real-time analysis of social users' reactions is hard to accomplish. In this paper, to easily capture the semantics of massive, informal online documents with an unsupervised learning mechanism, we design and implement an automatic topic extraction system based on the mass of the words that make up a document. The input data set for the proposed system is first generated using an N-gram algorithm, which builds multi-word terms to capture the meaning of sentences precisely, and Hadoop and Spark (an in-memory distributed computing framework) are adopted to run the topic model. In the experiments, TB-level input data are preprocessed and the proposed topic extraction steps are applied. We conclude that the proposed system performs well in extracting meaningful topics in a timely manner because intermediate results come directly from main memory instead of being read from an HDD.

Gaussian mixture model for automated tracking of modal parameters of long-span bridge

  • Mao, Jian-Xiao;Wang, Hao;Spencer, Billie F. Jr.
    • Smart Structures and Systems / v.24 no.2 / pp.243-256 / 2019
  • Determination of the most meaningful structural modes and gaining insight into how these modes evolve are important issues for long-term structural health monitoring of long-span bridges. To address this issue, modal parameters identified throughout the life of the bridge need to be compared and linked with each other, which is the process of mode tracking. The modal frequencies of a long-span bridge are typically closely spaced and sensitive to the environment (e.g., temperature, wind, traffic), which makes the automated tracking of modal parameters a difficult process, often requiring human intervention. Machine learning methods are well-suited for uncovering complex underlying relationships between processes and thus have the potential to realize accurate and automated modal tracking. In this study, the Gaussian mixture model (GMM), a popular unsupervised machine learning method, is employed to automatically determine and update baseline modal properties from the identified unlabeled modal parameters. On this foundation, a new method is proposed for automated mode tracking for long-span bridges. Firstly, a numerical example of a three-degree-of-freedom system is employed to validate the feasibility of using GMM to automatically determine the baseline modal properties. Subsequently, the field monitoring data of a long-span bridge are utilized to illustrate the practical usage of GMM for automated determination of the baseline list. Finally, continuously monitored bridge acceleration data recorded during strong typhoon events are employed to validate the reliability of the proposed method in tracking the changing modal parameters. Results show that the proposed method can automatically track the modal parameters in disastrous scenarios and provide valuable references for condition assessment of the bridge structure.
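The idea of letting a GMM group unlabeled identified frequencies into baseline modes can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: it fits a one-dimensional mixture by plain EM, the frequency values are hypothetical, and the function names `fit_gmm_1d` and `most_likely_mode` are invented for this example.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, k=2, iters=100, var_floor=1e-6):
    """Fit a 1-D Gaussian mixture with EM; means start evenly spaced over the data range."""
    lo, hi = min(data), max(data)
    means = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    variances = [max(((hi - lo) / k) ** 2, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave a component with no support unchanged
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)
    return weights, means, variances


def most_likely_mode(x, weights, means, variances):
    """Index of the mixture component with the highest weighted density at x."""
    scores = [w * gaussian_pdf(x, m, v)
              for w, m, v in zip(weights, means, variances)]
    return scores.index(max(scores))


# Hypothetical identified frequencies (Hz) scattered around two modes.
freqs = [0.98, 1.00, 1.02, 1.01, 0.99, 1.48, 1.50, 1.52, 1.51, 1.49]
weights, means, variances = fit_gmm_1d(freqs, k=2)
```

Here the fitted `means` play the role of baseline frequencies, and `most_likely_mode` links a newly identified frequency to one of them; a practical tracker would also need more features than frequency alone and a rule for rejecting spurious modes.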

Unsupervised one-class classification for condition assessment of bridge cables using Bayesian factor analysis

  • Wang, Xiaoyou;Li, Lingfang;Tian, Wei;Du, Yao;Hou, Rongrong;Xia, Yong
    • Smart Structures and Systems / v.29 no.1 / pp.41-51 / 2022
  • Cables are critical components of cable-stayed bridges. A structural health monitoring system provides real-time cable tension recording for cable health monitoring. However, the measurement data involve multiple sources of variability, i.e., varying environmental and operational factors, which increase the complexity of cable condition monitoring. In this study, a one-class classification method is developed for cable condition assessment using Bayesian factor analysis (FA). The single-peaked vehicle-induced cable tension is assumed to be relevant to vehicle positions and weights. The Bayesian FA is adopted to establish the correlation model between cable tensions and vehicles. Vehicle weights are assumed to be latent variables and the influences of different transverse positions are quantified by coefficient parameters. The Bayesian theorem is employed to estimate the parameters and variables automatically, and the damage index is defined on the basis of the well-trained model. The proposed method is applied to one cable-stayed bridge for cable damage detection. Significant deviations of the damage indices of Cable SJS11 were observed, indicating a damaged condition in 2011. This study develops a novel method to evaluate the health condition of individual cable using the FA in the Bayesian framework. Only vehicle-induced cable tensions are used and there is no need to monitor the vehicles. The entire process, including the data pre-processing, model training and damage index calculation of one cable, takes only 35 s, which is highly efficient.

Abnormal State Detection using Memory-augmented Autoencoder technique in Frequency-Time Domain

  • Haoyi Zhong;Yongjiang Zhao;Chang Gyoon Lim
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.2 / pp.348-369 / 2024
  • With the advancement of Industry 4.0 and Industrial Internet of Things (IIoT), manufacturing increasingly seeks automation and intelligence. Temperature and vibration monitoring are essential for machinery health. Traditional abnormal state detection methodologies often overlook the intricate frequency characteristics inherent in vibration time series and are susceptible to erroneously reconstructing temperature abnormalities due to the highly similar waveforms. To address these limitations, we introduce synergistic, end-to-end, unsupervised Frequency-Time Domain Memory-Enhanced Autoencoders (FTD-MAE) capable of identifying abnormalities in both temperature and vibration datasets. This model is adept at accommodating time series with variable frequency complexities and mitigates the risk of overgeneralization. Initially, the frequency domain encoder processes the spectrogram generated through Short-Time Fourier Transform (STFT), while the time domain encoder interprets the raw time series. This results in two disparate sets of latent representations. Subsequently, these are subjected to a memory mechanism and a limiting function, which numerically constrain each memory term. These processed terms are then amalgamated to create two unified, novel representations that the decoder leverages to produce reconstructed samples. Furthermore, the model employs Spectral Entropy to dynamically assess the frequency complexity of the time series, which, in turn, calibrates the weightage attributed to the loss functions of the individual branches, thereby generating definitive abnormal scores. Through extensive experiments, FTD-MAE achieved an average ACC and F1 of 0.9826 and 0.9808 on the CMHS and CWRU datasets, respectively. Compared to the best representative model, the ACC increased by 0.2114 and the F1 by 0.1876.
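Spectral entropy, which the abstract uses to gauge the frequency complexity of a time series, can be illustrated with the short sketch below. It shows only the entropy measure itself, computed from a naive DFT and normalized to the range 0 to 1; how FTD-MAE turns this value into loss weights is not given in the abstract and is not reproduced here. The test signals are hypothetical.

```python
import cmath
import math


def power_spectrum(x):
    """One-sided power spectrum via a naive O(n^2) DFT (fine for short signals)."""
    n = len(x)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(x):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    power = power_spectrum(x)
    total = sum(power)
    if total == 0 or len(power) < 2:
        return 0.0
    probs = [p / total for p in power]
    entropy = -sum(q * math.log(q) for q in probs if q > 0)
    return entropy / math.log(len(probs))


n = 64
# A single sinusoid concentrates power in one bin: entropy close to 0.
sine = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
# A unit impulse spreads power evenly over all bins: entropy equal to 1.
impulse = [1.0] + [0.0] * (n - 1)

low_complexity = spectral_entropy(sine)
high_complexity = spectral_entropy(impulse)
```

A value near 0 indicates a signal dominated by a few frequencies, while a value near 1 indicates a broadband signal.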

Autoencoder-Based Anomaly Detection Method for IoT Device Traffics (오토인코더 기반 IoT 디바이스 트래픽 이상징후 탐지 방법 연구)

  • Seung-A Park;Yejin Jang;Da Seul Kim;Mee Lan Han
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.2 / pp.281-288 / 2024
  • Sixth-generation (6G) wireless communication technology is advancing toward ultra-high speed, ultra-high bandwidth, and hyper-connectivity. With the development of communication technologies, the formation of a hyper-connected society is rapidly accelerating, expanding from the IoT (Internet of Things) to the IoE (Internet of Everything). At the same time, however, security threats targeting IoT devices have become widespread, and there are concerns about security incidents such as unauthorized access and information leakage. As a result, the need for security-enhancing solutions is increasing. In this paper, we implement an autoencoder-based anomaly detection model that uses network traffic collected in real time, in response to IoT security threats. Considering the difficulty of capturing IoT device traffic data for each attack in real IoT environments, we use an unsupervised learning-based autoencoder and implement six different autoencoder models that vary in the use of noise in the training data and in the dimensions of the latent space. By comparing model performance through experiments, we provide a performance evaluation of the anomaly detection model for detecting abnormal network traffic.

Numerical Studies on the Structural-health Evaluation of Subway Stations based on Statistical Pattern Recognition Techniques (패턴인식 기반 역사 구조건전성 평가기법 개발을 위한 수치해석 연구)

  • Shin, Jeong-Ryol;An, Tae-Ki;Lee, Chang-Gil;Park, Seung-Hee
    • Proceedings of the KSR Conference / 2011.05a / pp.1735-1741 / 2011
  • The safety of station structures should be a top priority among railway infrastructures because hundreds of thousands of passengers take the subway every day. Station structures that have been in operation since the 1970s are especially vulnerable to earthquakes and to long-term vibrations such as ambient train vibrations. This is why a structural health monitoring system for station structures is required. For this reason, the Korean government has made an effort to develop a structural health monitoring system that can evaluate the health state of station structures and monitor vulnerable structural members in real time. Through the monitoring system, vulnerable structural members can then be retrofitted. To develop a health-state evaluation method for station structures that uses real-time sensing data measured in the field, the authors carried out numerical simulations to develop evaluation algorithms based on statistical pattern recognition techniques. In this study, the dynamic behavior of Chungmuro station in Seoul was numerically analyzed, and critical members were then chosen. Damage was artificially simulated at the selected critical members of the numerical model, and supervised and unsupervised learning-based pattern recognition algorithms were applied to quantify and localize the structural defects.
