• Title/Summary/Keyword: The time-series data

Search Result 3,672, Processing Time 0.03 seconds

Time-Series based Dataset Selection Method for Effective Text Classification (효율적인 문헌 분류를 위한 시계열 기반 데이터 집합 선정 기법)

  • Chae, Yeonghun;Jeong, Do-Heon
    • The Journal of the Korea Contents Association
    • /
    • v.17 no.1
    • /
    • pp.39-49
    • /
    • 2017
  • As the Internet technology advances, data on the web is increasing sharply. Many research study about incremental learning for classifying effectively in data increasing. Web document contains the time-series data such as published date. If we reflect time-series data to classification, it will be an effective classification. In this study, we analyze the time-series variation of the words. We propose an efficient classification through dividing the dataset based on the analysis of time-series information. For experiment, we corrected 1 million online news articles including time-series information. We divide the dataset and classify the dataset using SVM and $Na{\ddot{i}}ve$ Bayes. In each model, we show that classification performance is increasing. Through this study, we showed that reflecting time-series information can improve the classification performance.

Prediction of arrhythmia using multivariate time series data (다변량 시계열 자료를 이용한 부정맥 예측)

  • Lee, Minhai;Noh, Hohsuk
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.5
    • /
    • pp.671-681
    • /
    • 2019
  • Studies on predicting arrhythmia using machine learning have been actively conducted with increasing number of arrhythmia patients. Existing studies have predicted arrhythmia based on multivariate data of feature variables extracted from RR interval data at a specific time point. In this study, we consider that the pattern of the heart state changes with time can be important information for the arrhythmia prediction. Therefore, we investigate the usefulness of predicting the arrhythmia with multivariate time series data obtained by extracting and accumulating the multivariate vectors of the feature variables at various time points. When considering 1-nearest neighbor classification method and its ensemble for comparison, it is confirmed that the multivariate time series data based method can have better classification performance than the multivariate data based method if we select an appropriate time series distance function.

A Study on the Health Index Based on Degradation Patterns in Time Series Data Using ProphetNet Model (ProphetNet 모델을 활용한 시계열 데이터의 열화 패턴 기반 Health Index 연구)

  • Sun-Ju Won;Yong Soo Kim
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.46 no.3
    • /
    • pp.123-138
    • /
    • 2023
  • The Fourth Industrial Revolution and sensor technology have led to increased utilization of sensor data. In our modern society, data complexity is rising, and the extraction of valuable information has become crucial with the rapid changes in information technology (IT). Recurrent neural networks (RNN) and long short-term memory (LSTM) models have shown remarkable performance in natural language processing (NLP) and time series prediction. Consequently, there is a strong expectation that models excelling in NLP will also excel in time series prediction. However, current research on Transformer models for time series prediction remains limited. Traditional RNN and LSTM models have demonstrated superior performance compared to Transformers in big data analysis. Nevertheless, with continuous advancements in Transformer models, such as GPT-2 (Generative Pre-trained Transformer 2) and ProphetNet, they have gained attention in the field of time series prediction. This study aims to evaluate the classification performance and interval prediction of remaining useful life (RUL) using an advanced Transformer model. The performance of each model will be utilized to establish a health index (HI) for cutting blades, enabling real-time monitoring of machine health. The results are expected to provide valuable insights for machine monitoring, evaluation, and management, confirming the effectiveness of advanced Transformer models in time series analysis when applied in industrial settings.

Reconstruction of gusty wind speed time series from autonomous data logger records

  • Amezcua, Javier;Munoz, Raul;Probst, Oliver
    • Wind and Structures
    • /
    • v.14 no.4
    • /
    • pp.337-357
    • /
    • 2011
  • The collection of wind speed time series by means of digital data loggers occurs in many domains, including civil engineering, environmental sciences and wind turbine technology. Since averaging intervals are often significantly larger than typical system time scales, the information lost has to be recovered in order to reconstruct the true dynamics of the system. In the present work we present a simple algorithm capable of generating a real-time wind speed time series from data logger records containing the average, maximum, and minimum values of the wind speed in a fixed interval, as well as the standard deviation. The signal is generated from a generalized random Fourier series. The spectrum can be matched to any desired theoretical or measured frequency distribution. Extreme values are specified through a postprocessing step based on the concept of constrained simulation. Applications of the algorithm to 10-min wind speed records logged at a test site at 60 m height above the ground show that the recorded 10-min values can be reproduced by the simulated time series to a high degree of accuracy.

Comparison of long-term forecasting performance of export growth rate using time series analysis models and machine learning analysis (시계열 분석 모형 및 머신 러닝 분석을 이용한 수출 증가율 장기예측 성능 비교)

  • Seong-Hwi Nam
    • Korea Trade Review
    • /
    • v.46 no.6
    • /
    • pp.191-209
    • /
    • 2021
  • In this paper, various time series analysis models and machine learning models are presented for long-term prediction of export growth rate, and the prediction performance is compared and reviewed by RMSE and MAE. Export growth rate is one of the major economic indicators to evaluate the economic status. And It is also used to predict economic forecast. The export growth rate may have a negative (-) value as well as a positive (+) value. Therefore, Instead of using the ReLU function, which is often used for time series prediction of deep learning models, the PReLU function, which can have a negative (-) value as an output value, was used as the activation function of deep learning models. The time series prediction performance of each model for three types of data was compared and reviewed. The forecast data of long-term prediction of export growth rate was deduced by three forecast methods such as a fixed forecast method, a recursive forecast method and a rolling forecast method. As a result of the forecast, the traditional time series analysis model, ARDL, showed excellent performance, but as the time period of learning data increases, the performance of machine learning models including LSTM was relatively improved.

Comparison of time series clustering methods and application to power consumption pattern clustering

  • Kim, Jaehwi;Kim, Jaehee
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.6
    • /
    • pp.589-602
    • /
    • 2020
  • The development of smart grids has enabled the easy collection of a large amount of power data. There are some common patterns that make it useful to cluster power consumption patterns when analyzing s power big data. In this paper, clustering analysis is based on distance functions for time series and clustering algorithms to discover patterns for power consumption data. In clustering, we use 10 distance measures to find the clusters that consider the characteristics of time series data. A simulation study is done to compare the distance measures for clustering. Cluster validity measures are also calculated and compared such as error rate, similarity index, Dunn index and silhouette values. Real power consumption data are used for clustering, with five distance measures whose performances are better than others in the simulation.

Threshold-asymmetric volatility models for integer-valued time series

  • Kim, Deok Ryun;Yoon, Jae Eun;Hwang, Sun Young
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.3
    • /
    • pp.295-304
    • /
    • 2019
  • This article deals with threshold-asymmetric volatility models for over-dispersed and zero-inflated time series of count data. We introduce various threshold integer-valued autoregressive conditional heteroscedasticity (ARCH) models as incorporating over-dispersion and zero-inflation via conditional Poisson and negative binomial distributions. EM-algorithm is used to estimate parameters. The cholera data from Kolkata in India from 2006 to 2011 is analyzed as a real application. In order to construct the threshold-variable, both local constant mean which is time-varying and grand mean are adopted. It is noted via a data application that threshold model as an asymmetric version is useful in modelling count time series volatility.

Control Limits of Time Series Data using Hilbert-Huang Transform : Dealing with Nested Periods (힐버트-황 변환을 이용한 시계열 데이터 관리한계 : 중첩주기의 사례)

  • Suh, Jung-Yul;Lee, Sae Jae
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.37 no.4
    • /
    • pp.35-41
    • /
    • 2014
  • Real-life time series characteristic data has significant amount of non-stationary components, especially periodic components in nature. Extracting such components has required many ad-hoc techniques with external parameters set by users in a case-by-case manner. In this study, we used Empirical Mode Decomposition Method from Hilbert-Huang Transform to extract them in a systematic manner with least number of ad-hoc parameters set by users. After the periodic components are removed, the remaining time-series data can be analyzed with traditional methods such as ARIMA model. Then we suggest a different way of setting control chart limits for characteristic data with periodic components in addition to ARIMA components.

Efficient Compression Algorithm with Limited Resource for Continuous Surveillance

  • Yin, Ling;Liu, Chuanren;Lu, Xinjiang;Chen, Jiafeng;Liu, Caixing
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.11
    • /
    • pp.5476-5496
    • /
    • 2016
  • Energy efficiency of resource-constrained wireless sensor networks is critical in applications such as real-time monitoring/surveillance. To improve the energy efficiency and reduce the energy consumption, the time series data can be compressed before transmission. However, most of the compression algorithms for time series data were developed only for single variate scenarios, while in practice there are often multiple sensor nodes in one application and the collected data is actually multivariate time series. In this paper, we propose to compress the time series data by the Lasso (least absolute shrinkage and selection operator) approximation. We show that, our approach can be naturally extended for compressing the multivariate time series data. Our extension is novel since it constructs an optimal projection of the original multivariates where the best energy efficiency can be realized. The two algorithms are named by ULasso (Univariate Lasso) and MLasso (Multivariate Lasso), for which we also provide practical guidance for parameter selection. Finally, empirically evaluation is implemented with several publicly available real-world data sets from different application domains. We quantify the algorithm performance by measuring the approximation error, compression ratio, and computation complexity. The results show that ULasso and MLasso are superior to or at least equivalent to compression performance of LTC and PLAMlis. Particularly, MLasso can significantly reduce the smooth multivariate time series data, without breaking the major trends and important changes of the sensor network system.

Evaluating Efficacy of Hilbert-Huang Transform in Analyzing Manufacturing Time Series Data with Periodic Components (제조업의 주기성 시계열분석에서 힐버트 황 변환의 효용성 평가)

  • Lee, Sae-Jae;Suh, Jung-Yul
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.35 no.2
    • /
    • pp.106-112
    • /
    • 2012
  • Real-life time series characteristic data has significant amount of non-stationary components, especially periodic components in nature. Extracting such components has required many ad-hoc techniques with external parameters set by users in case-by-case manner. In our study, we evaluate whether Hilbert-Huang Transform, a new tool of time-series analysis can be used for effective analysis of such data. It is divided into two points : 1) how effective it is in finding periodic components, 2) whether we can use its results directly in detecting values outside control limits, for which a traditional method such as ARIMA had been used. We use glass furnace temperature data to illustrate the method.