• Title/Summary/Keyword: Time Series Data Processing

Style-Based Transformer for Time Series Forecasting (시계열 예측을 위한 스타일 기반 트랜스포머)

  • Kim, Dong-Keon;Kim, Kwangsu
    • KIPS Transactions on Software and Data Engineering / v.10 no.12 / pp.579-586 / 2021
  • Time series forecasting refers to predicting future values based on past observations. Accurate forecasts are crucial because they are used to establish strategies and make policy decisions in various fields. Recently, transformer models have been widely studied for time series prediction. However, the existing transformer model has an auto-regressive structure in which each output is fed back as input while the prediction sequence is generated. This limitation lowers accuracy when predicting a distant time point. To handle this problem and forecast time series more precisely, this paper proposes a sequential decoding model based on the style transformation technique. In the proposed model, the content of past data is extracted by the transformer encoder and reflected in a style-based decoder to generate the predictive sequence. Unlike the decoder of the conventional auto-regressive transformer, this structure outputs the prediction sequence all at once, which allows it to predict distant time points more accurately. Prediction experiments on various time series datasets with different data characteristics showed that the proposed model has better prediction accuracy than other existing time series prediction models.

Cleaning Noises from Time Series Data with Memory Effects

  • Cho, Jae-Han;Lee, Lee-Sub
    • Journal of the Korea Society of Computer and Information / v.25 no.4 / pp.37-45 / 2020
  • The development process of deep learning is an iterative task that requires a lot of manual work. Among the steps in the development process, pre-processing of training data is very costly and significantly affects the learning results. In the early days of AI algorithm research, training data mainly came from public databases provided by data scientists. Training data collected in real environments is mostly operational sensor data and inevitably contains various kinds of noise. Accordingly, various data cleaning frameworks and noise removal methods have been studied. In this paper, we propose a method for detecting and removing noise from time-series data, such as sensor data, that can occur in the IoT environment. In this method, linear regression is used so that the system repeatedly finds noisy values and provides replacement values to clean the training data. To verify the effectiveness of the proposed method, a simulation method is proposed, along with a method of determining the factors that yield optimal cleaning results.
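
The iterative regression-based cleaning described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fit_line` and `clean_series`, the residual threshold `k`, and the iteration cap `max_iter` are all assumptions introduced here for illustration, and a single global line is fitted where the paper may use a different regression window.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a * x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def clean_series(values, k=2.0, max_iter=10):
    """Repeatedly replace points whose residual exceeds k standard deviations.

    k and max_iter are hypothetical tuning factors, not values from the paper.
    """
    cleaned = list(values)
    xs = list(range(len(cleaned)))
    for _ in range(max_iter):
        a, b = fit_line(xs, cleaned)
        residuals = [y - (a * x + b) for x, y in zip(xs, cleaned)]
        sigma = pstdev(residuals)
        if sigma == 0:
            break  # perfect fit, nothing left to clean
        noisy = [i for i, r in enumerate(residuals) if abs(r) > k * sigma]
        if not noisy:
            break
        for i in noisy:
            # Replace the suspected noise with the regression estimate.
            cleaned[i] = a * xs[i] + b
    return cleaned


# A linear sensor trend with one injected spike at index 7.
signal = [2 * i + 1 for i in range(20)]
signal[7] = 100
restored = clean_series(signal)
```

Because the threshold is relative to the residual spread, a lone deviating point keeps being refined until the iteration cap is reached; a practical version would probably add an absolute tolerance as a stopping rule.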

An Efficient Vision-based Object Detection and Tracking using Online Learning

  • Kim, Byung-Gyu;Hong, Gwang-Soo;Kim, Ji-Hae;Choi, Young-Ju
    • Journal of Multimedia Information System / v.4 no.4 / pp.285-288 / 2017
  • In this paper, we propose a vision-based object detection and tracking system using online learning. The proposed system adopts a feature point-based method that tracks the inter-frame movement of a newly detected object, so that motion is estimated quickly and robustly. At the same time, it trains the detector online for the object being tracked. When tracking fails, the system temporarily uses the detector's result for the object to re-initialize the tracker, which enables robust tracking. In particular, the processing time is reduced by improving the method of updating the appearance models of the objects, which increases the tracking performance of the system. Using a data set obtained in a variety of settings, we evaluate the performance of the proposed system in terms of processing time.

Application and Research of Monte Carlo Sampling Algorithm in Music Generation

  • MIN, Jun;WANG, Lei;PANG, Junwei;HAN, Huihui;Li, Dongyang;ZHANG, Maoqing;HUANG, Yantai
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.10 / pp.3355-3372 / 2022
  • Composing music is an inspired yet challenging task, in that the process involves many considerations such as assigning pitches, determining rhythm, and arranging accompaniment. Algorithmic composition aims to develop algorithms for music composition. Recently, algorithmic composition using artificial intelligence technologies has received considerable attention. In particular, computational intelligence is widely used and achieves promising results in the creation of music. This paper attempts to provide a survey on music generation based on the Monte Carlo (MC) algorithm. First, MIDI music files are transformed into digital data, and a logistic fitting method is used to fit the time series and obtain the regular pattern of the time distribution. Besides the time series, the converted data also includes duration, pitch, and velocity. Second, MC simulation is applied to each of these to summarize its distribution law. The two main control parameters are the discrete sampling value and the standard deviation. After these parameters are processed, the data is converted back to a MIDI file, compared with the output generated by an LSTM neural network, and the music is evaluated comprehensively.

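A Study on Determining the Monitoring Frequency Using Long-Term Water Level Observation Data from National Groundwater Monitoring Stations (국가지하수 관측소의 장기수위관측자료를 활용한 관측주기 결정 연구)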

  • 김규범;김정우;원종호;이명재;이진용;이강근
    • Proceedings of the Korean Society of Soil and Groundwater Environment Conference / 2003.09a / pp.199-201 / 2003
  • Monitoring effectiveness depends not only on the effectiveness of the network but also on its costs. Generally, the costs of a monitoring network arise mainly from equipment and personnel; implementation and maintenance; observation and sample collection; sample analysis; and data storage and processing. The cost of the monitoring network can be expressed as a function of monitoring frequency because the monitoring method can be an automatic or a manual measurement. To determine the sampling frequency of subsidiary groundwater monitoring stations, time series data from national groundwater monitoring stations were used. The proposed optimal sampling interval for subsidiary groundwater monitoring stations is about 7 to 20 days, with an average of about 2 weeks.

Bayesian estimation for frequency using resampling methods (재표본 방법론을 활용한 베이지안 주파수 추정)

  • Pak, Ro Jin
    • The Korean Journal of Applied Statistics / v.30 no.6 / pp.877-888 / 2017
  • Spectral analysis is used to determine the frequency of time series data. We first determine the frequency of the series through the power spectrum or the periodogram and then calculate the period of a cycle that may exist in the time series. Bayesian techniques for estimating the frequency have been developed and proven useful; however, the Bayesian estimator for the frequency cannot be solved analytically and may be handled numerically or computationally. In this paper, we make inferences on the Bayesian frequency estimate both by resampling a parameter with Markov chain Monte Carlo (MCMC) methods and by resampling data with bootstrap methods for a time series. We take the Korean real estate price index as an example for Bayesian frequency estimation. We found a difference in the periods between the sale price index and the long-term rental price index, but the difference is not statistically significant.
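
The first step mentioned in this abstract, reading a dominant frequency off the periodogram and converting it to a period, can be sketched as follows. This covers only the classical periodogram step, not the Bayesian estimator, MCMC, or bootstrap parts of the paper. The function names `periodogram` and `dominant_period` are hypothetical, and the direct DFT is written out with the standard library for clarity rather than speed.

```python
import cmath
import math


def periodogram(series):
    """Return (frequency, power) pairs at the Fourier frequencies k / n."""
    n = len(series)
    avg = sum(series) / n
    centered = [x - avg for x in series]  # remove the mean (zero frequency)
    result = []
    for k in range(1, n // 2 + 1):
        coef = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        result.append((k / n, abs(coef) ** 2 / n))
    return result


def dominant_period(series):
    """Period (in samples) of the frequency with the largest periodogram power."""
    freq, _ = max(periodogram(series), key=lambda pair: pair[1])
    return 1.0 / freq


# Synthetic monthly-style series with a 12-sample cycle.
cycle = [math.sin(2 * math.pi * t / 12) for t in range(120)]
period = dominant_period(cycle)
```

The periodogram can only resolve frequencies of the form k / n, which is one reason an estimator that is not restricted to this grid, such as the Bayesian approach in the paper, may be attractive.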

Anomaly Detection of Machining Process based on Power Load Analysis (전력 부하 분석을 통한 절삭 공정 이상탐지)

  • Jun Hong Yook;Sungmoon Bae
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.4 / pp.173-180 / 2023
  • Smart factory companies are installing various sensors in production facilities and collecting field data. However, although academic research using field data is actively underway, relatively few companies actively utilize the data they collect. This study seeks to develop a model that detects anomalies in the process by analyzing spindle power data from a company that processes shafts used in automobile throttle valves. Since the data collected during machining is time series data, the model was developed through unsupervised learning by applying the Holt-Winters technique and various deep learning algorithms such as RNN, LSTM, GRU, BiRNN, BiLSTM, and BiGRU. To evaluate each model, the difference between predicted and actual values was compared using MSE and RMSE. The BiLSTM model showed the best results based on RMSE. To diagnose abnormalities with the developed model, the critical point was set using statistical techniques in consultation with experts in the field and then verified. By collecting and preprocessing real-world data and developing a model, this study serves as a case study of utilizing time-series data in small and medium-sized enterprises.
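
The evaluation and thresholding steps in this abstract can be illustrated with a small sketch. MSE and RMSE follow their standard definitions. The critical point rule shown here (mean plus `k` standard deviations of the errors observed on normal data) is only an assumed example of a "statistical technique"; the abstract does not state which rule the authors used, and `anomaly_flags`, `normal_errors`, and `k` are hypothetical names.

```python
import math
from statistics import mean, pstdev


def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))


def anomaly_flags(actual, predicted, normal_errors, k=3.0):
    """Flag points whose absolute error exceeds mean + k * std of normal errors.

    The mean + k * std rule is an assumption, not the paper's stated method.
    """
    threshold = mean(normal_errors) + k * pstdev(normal_errors)
    return [abs(a - p) > threshold for a, p in zip(actual, predicted)]


actual = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
score = rmse(actual, predicted)
flags = anomaly_flags(actual, predicted, normal_errors=[0.1, 0.2, 0.1, 0.2])
```

In practice, `predicted` would come from whichever forecasting model was selected (the BiLSTM in the study), and `k` would be adjusted with domain experts.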

Relationships Between the Characteristics of the Business Data Set and Forecasting Accuracy of Prediction models (시계열 데이터의 성격과 예측 모델의 예측력에 관한 연구)

  • 이원하;최종욱
    • Journal of Intelligence and Information Systems / v.4 no.1 / pp.133-147 / 1998
  • Recently, many researchers have been involved in finding deterministic equations that can accurately predict future events, based on chaos theory or fractal theory. The theory says that some events which seem very random but are internally deterministic can be accurately predicted by fractal equations. In contrast to conventional methods, such as the AR model, MA model, or ARIMA model, the fractal equation attempts to discover a deterministic order inherent in the time series data set. In discovering deterministic order, researchers have found that neural networks are much more effective than the conventional statistical models. Even though the prediction accuracy of the network can differ depending on the topological structure and modification of the algorithms, many researchers have asserted that neural network systems outperform other systems because of the non-linear behaviour of the network models, mechanisms of massive parallel processing, and generalization capability based on adaptive learning. However, a recent survey shows that the prediction accuracy of forecasting models can be determined by the model structure and data structures. In experiments based on actual economic data sets, it was found that the prediction accuracy of the neural network model is similar to the performance level of the conventional forecasting model. In particular, for the data set which is deterministically chaotic, the AR model, a conventional statistical model, was not significantly different from the MLP model, a neural network model. This result shows that the forecasting model appropriate to a prediction task should be selected based on the characteristics of the time series data set. Analysis of the characteristics of the data set was performed by fractal analysis, measurement of the Hurst index, and measurement of Lyapunov exponents. In conclusion, no significant difference was found between a conventional forecasting model and a typical neural network model in forecasting future events for time series data which is deterministically chaotic.

Training Method of Artificial Neural Networks for Implementation of Automatic Composition Systems (자동작곡시스템 구현을 위한 인공신경망의 학습방법)

  • Cho, Jae-Min;Ryu, Eun Mi;Oh, Jin-Woo;Jung, Sung Hoon
    • KIPS Transactions on Software and Data Engineering / v.3 no.8 / pp.315-320 / 2014
  • Composition is a creative activity in which a composer expresses his or her emotion as melody based on experience. However, it is very hard to implement an automatic composition program whose composition process is the same as a composer's. On the basis that creative activity can arise from imitation, we propose a method to implement an automatic composition system using the learning capability of ANNs (Artificial Neural Networks). First, we devise a method to convert a melody into a time series that an ANN can be trained on, and then another method to learn repeated melodies together with the melody bar so that the ANN is trained correctly. After training the ANN on the time series, we feed a new time series into the ANN, and the ANN produces a completely new time series, which is converted into a new melody. However, post-processing is necessary because the produced melody does not fit the tempo and harmony of music theory. In this paper, we applied tempo post-processing using a tempo post-processing program, but the harmony post-processing is done by a human because it is difficult to implement. We will implement a harmony post-processing program in future work.

A Study of Similarity Measures on Multidimensional Data Sequences Using Semantic Information (의미 정보를 이용한 다차원 데이터 시퀀스의 유사성 척도 연구)

  • Lee, Seok-Lyong;Lee, Ju-Hong;Chun, Seok-Ju
    • The KIPS Transactions:PartD / v.10D no.2 / pp.283-292 / 2003
  • One-dimensional time-series data have been studied in various database applications such as data mining and data warehousing. However, in the current complex business environment, multidimensional data sequences (MDSs) are becoming increasingly important in addition to one-dimensional time-series data. For example, a video stream can be modeled as an MDS in the multidimensional space with respect to color and texture attributes. In this paper, we propose effective similarity measures on which similar pattern retrieval is based. An MDS is partitioned into segments, each of which is represented by various geometric and semantic features. The similarity measures are defined on the basis of these segments. Using the measures, irrelevant segments are pruned from a database with respect to a given query. Both data sequences and query sequences are partitioned into segments, and query processing is based on comparing the features of data and query segments instead of scanning all data elements of entire sequences.