• Title/Summary/Keyword: high dimensional time series

Performance Evaluation of a Feature-Importance-based Feature Selection Method for Time Series Prediction

  • Hyun, Ahn
    • Journal of information and communication convergence engineering
    • /
    • v.21 no.1
    • /
    • pp.82-89
    • /
    • 2023
  • Various machine-learning models can yield high predictive power for massive time series. However, these models are prone to instability in computational cost because of the high dimensionality of the feature space and non-optimized hyperparameter settings. Because model training with a high-dimensional feature set can be time-consuming, we evaluate a feature-importance-based feature selection method to find a tradeoff between predictive power and computational cost for time series prediction. For the performance evaluation, we used two machine learning techniques to generate prediction models from a retail sales dataset. First, we ranked the features using impurity-based and Local Interpretable Model-agnostic Explanations (LIME)-based feature importance measures in the prediction models. Then, the recursive feature elimination method was applied to eliminate unimportant features sequentially. Consequently, we obtained a subset of features that reduces model training time while preserving acceptable model performance.
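
The rank-then-eliminate loop described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function `recursive_feature_elimination`, the column names, and the data are hypothetical, and absolute Pearson correlation is swapped in for the impurity- and LIME-based importance measures so that the example needs only the standard library.

```python
import math

def recursive_feature_elimination(feature_names, importance_fn, n_keep):
    """Drop the least important feature one at a time until n_keep remain.

    importance_fn(remaining) must return a dict {feature name: score}; it is
    called again every round, as a refitted model would be.
    """
    remaining = list(feature_names)
    eliminated = []
    while len(remaining) > n_keep:
        scores = importance_fn(remaining)
        weakest = min(remaining, key=lambda name: scores[name])
        remaining.remove(weakest)
        eliminated.append(weakest)
    return remaining, eliminated

def abs_correlation(xs, ys):
    """Absolute Pearson correlation (stand-in importance, an assumption)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return abs(cov) / math.sqrt(var_x * var_y)

# Hypothetical retail-style columns and sales target.
data = {
    "price": [1.0, 2.0, 3.0, 4.0, 5.0],
    "promo": [0.0, 1.0, 0.0, 1.0, 1.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}
target = [2.0, 4.0, 6.0, 8.0, 10.0]

def importance(remaining):
    return {name: abs_correlation(data[name], target) for name in remaining}

kept, dropped = recursive_feature_elimination(list(data), importance, n_keep=1)
print(kept, dropped)
```

In practice `importance_fn` would retrain the prediction model on the remaining features and return its importance scores, which is where the training-time savings of a smaller feature set would show up.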

Hybrid Lower-Dimensional Transformation for Similar Sequence Matching (유사 시퀀스 매칭을 위한 하이브리드 저차원 변환)

  • Moon, Yang-Sae;Kim, Jin-Ho
    • The KIPS Transactions:PartD
    • /
    • v.15D no.1
    • /
    • pp.31-40
    • /
    • 2008
  • We generally use lower-dimensional transformations to convert high-dimensional sequences into low-dimensional points in similar sequence matching. These traditional transformations, however, show different indexing performance depending on the type of time-series data. This means that the choice of lower-dimensional transformation has a significant influence on the indexing performance in similar sequence matching. To solve this problem, in this paper we propose a hybrid approach that integrates multiple transformations and uses them in a single multidimensional index. We first propose a new notion of hybrid lower-dimensional transformation that exploits different lower-dimensional transformations for a sequence. We next define the hybrid distance to compute the distance between the transformed sequences. We then formally prove that the hybrid approach performs similar sequence matching correctly. We also present the index building and similar sequence matching algorithms that use the hybrid approach. Experimental results for various time-series data sets show that our hybrid approach outperforms the single transformation-based approach. These results indicate that the hybrid approach can be widely used for various time-series data with different characteristics.

Volatility for High Frequency Time Series Toward fGARCH(1,1) as a Functional Model

  • Hwang, Sun Young;Yoon, Jae Eun
    • Quantitative Bio-Science
    • /
    • v.37 no.2
    • /
    • pp.73-79
    • /
    • 2018
  • As high frequency (HF, for short) time series are now prevalent in the presence of real-time big data, volatility computations based on traditional ARCH/GARCH models need to be further developed to suit high frequency characteristics. This article reviews realized volatilities (RV) and multivariate GARCH (MGARCH) for high frequency volatility computations. As (functional) infinite-dimensional models, the fARCH and fGARCH are introduced to accommodate ultra high frequency (UHF) volatilities. The fARCH and fGARCH models were developed in the recent literature by Hormann et al. [1] and Aue et al. [2], respectively, and our discussions are mainly based on these two key articles. Real data applications to domestic UHF financial time series are illustrated.
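
For readers unfamiliar with realized volatility, the sketch below uses the common textbook definition: the realized variance of a period is the sum of squared intraday log returns, and the realized volatility is its square root. The abstract does not state which variant the article reviews, so this definition, the function names, and the price path are assumptions.

```python
import math

def realized_variance(prices):
    """Sum of squared log returns over one period (for example, one trading day)."""
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return sum(r * r for r in log_returns)

def realized_volatility(prices):
    """Square root of the realized variance."""
    return math.sqrt(realized_variance(prices))

# Hypothetical intraday price path sampled at a fixed interval.
intraday_prices = [100.0, 100.5, 99.8, 100.2, 100.9]
print(realized_volatility(intraday_prices))
```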

Thermal-hydraulic simulation and evaluation of a natural circulation thermosyphon loop for a reactor cavity cooling system of a high-temperature reactor

  • Swart, R.;Dobson, R.T.
    • Nuclear Engineering and Technology
    • /
    • v.52 no.2
    • /
    • pp.271-278
    • /
    • 2020
  • This paper investigates a full-scale thermosyphon loop, 27 m high by 6 m wide. The simulation model is based on a one-dimensional axially-symmetrical control volume approach, where the loop is divided into a series of discrete control volumes. The three conservation equations, namely, mass, momentum and energy, were applied to these control volumes and solved with an explicit numerical method. The flow is assumed to be quasi-static, implying that the mass-flow rate changes over time. However, at any instant in time the mass-flow rate is constant around the loop. The Boussinesq approximation was invoked, and a reasonable correlation between the experimental and theoretical results was obtained. Experimental results are presented and the flow regimes of the working fluid inside the loop identified. The results indicate that a series of such thermosyphon loops can be used as a cavity cooling system and that the one-dimensional theoretical model can predict the internal temperature and mass-flow rate of the thermosyphon loop.

Physical Database Design for DFT-Based Multidimensional Indexes in Time-Series Databases (시계열 데이터베이스에서 DFT-기반 다차원 인덱스를 위한 물리적 데이터베이스 설계)

  • Kim, Sang-Wook;Kim, Jin-Ho;Han, Byung-ll
    • Journal of Korea Multimedia Society
    • /
    • v.7 no.11
    • /
    • pp.1505-1514
    • /
    • 2004
  • Sequence matching in time-series databases is an operation that finds the data sequences whose changing patterns are similar to that of a query sequence. Typically, sequence matching employs a multi-dimensional index for efficient processing. In order to alleviate the dimensionality curse problem of the multi-dimensional index in high-dimensional cases, the previous methods for sequence matching apply the Discrete Fourier Transform (DFT) to data sequences, and take only the first two or three DFT coefficients as organizing attributes of the multi-dimensional index. This paper first points out the problems in such simple methods that take the first two or three coefficients, and proposes a novel solution to construct the optimal multi-dimensional index. The proposed method analyzes the characteristics of a target database, and identifies the organizing attributes having the best discrimination power based on the analysis. It also determines the optimal number of organizing attributes for efficient sequence matching by using a cost model. To show the effectiveness of the proposed method, we perform a series of experiments. The results show that the proposed method outperforms the previous ones significantly.
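
The DFT-based reduction that this abstract builds on can be sketched as follows. With an orthonormal DFT, the distance computed on any subset of coefficients never exceeds the true Euclidean distance, which is why an index on a few coefficients produces no false dismissals. The function names are hypothetical, and `rank_coefficients_by_variance` is only an assumed stand-in for "discrimination power"; the abstract does not say how the paper measures it.

```python
import cmath
import math

def dft(seq):
    """Orthonormal DFT (scaled by 1/sqrt(n)), which preserves Euclidean distance."""
    n = len(seq)
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(x * cmath.exp(-2j * math.pi * f * t / n) for t, x in enumerate(seq))
        for f in range(n)
    ]

def first_k_features(seq, k):
    """Baseline from earlier methods: keep only the first k DFT coefficients."""
    return dft(seq)[:k]

def distance(a, b):
    """Euclidean distance; works for real sequences and complex coefficients."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))

def rank_coefficients_by_variance(database):
    """Order coefficient positions by their variance across the database.

    Variance is an assumption used here to illustrate choosing organizing
    attributes from the data instead of always taking the first few.
    """
    spectra = [dft(s) for s in database]

    def variance(f):
        values = [spectrum[f] for spectrum in spectra]
        mean = sum(values) / len(values)
        return sum(abs(v - mean) ** 2 for v in values) / len(values)

    return sorted(range(len(spectra[0])), key=variance, reverse=True)

# Hypothetical sequences of equal length.
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [2.0, 2.0, 4.0, 4.0, 5.0, 7.0, 7.0, 9.0]
print(distance(first_k_features(a, 3), first_k_features(b, 3)), distance(a, b))
```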

High-dimensional change point detection using MOSUM-based sparse projection (MOSUM 성근 프로젝션을 이용한 고차원 시계열의 변화점 추정)

  • Kim, Moonjung;Baek, Changryong
    • The Korean Journal of Applied Statistics
    • /
    • v.35 no.1
    • /
    • pp.63-75
    • /
    • 2022
  • This paper proposes the so-called MOSUM-based sparse projection method for change point detection in high-dimensional time series. Our method is inspired by Wang and Samworth (2018), but improves on their method in two ways. First, it finds all change points at once, which minimizes sequential error. Second, it is localized, which makes it more robust to mean changes that offset each other. We also propose data-driven threshold selection using the block wild bootstrap. A comprehensive simulation study shows that our method performs reasonably well in finite samples. We also apply our method to the stock prices that make up the S&P 500 index and find four change points in the most recent six years.
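
To illustrate the moving sum (MOSUM) idea only, here is a univariate sketch: at each time point it compares the sum of the next G observations with the sum of the previous G, and reports local maxima above a threshold. It is not the paper's method: the sparse projection step and the bootstrap threshold are omitted, the noise variance is assumed to be 1, and the function names, bandwidth, and threshold are hypothetical.

```python
import math

def mosum_statistic(x, G):
    """Return {k: |sum(x[k:k+G]) - sum(x[k-G:k])| / sqrt(2G)} for each valid k."""
    n = len(x)
    prefix = [0.0]
    for value in x:
        prefix.append(prefix[-1] + value)
    stats = {}
    for k in range(G, n - G + 1):
        left = prefix[k] - prefix[k - G]
        right = prefix[k + G] - prefix[k]
        stats[k] = abs(right - left) / math.sqrt(2 * G)
    return stats

def detect_change_points(x, G, threshold):
    """Report every k whose statistic exceeds the threshold and is the maximum
    within G positions on either side, so all change points are found in one pass."""
    stats = mosum_statistic(x, G)
    points = []
    for k, value in stats.items():
        if value <= threshold:
            continue
        window = [stats[j] for j in range(k - G, k + G + 1) if j in stats]
        if value == max(window):
            points.append(k)
    return points

# Hypothetical noise-free series whose mean shifts at positions 20 and 40.
series = [0.0] * 20 + [5.0] * 20 + [0.0] * 20
print(detect_change_points(series, G=5, threshold=3.0))
```

Because each statistic uses only the 2G observations around k, a mean shift elsewhere in the series does not dilute it, which is the localization property the abstract refers to.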

Time Series Representation Combining PIPs Detection and Persist Discretization Techniques for Time Series Classification (시계열 분류를 위한 PIPs 탐지와 Persist 이산화 기법들을 결합한 시계열 표현)

  • Park, Sang-Ho;Lee, Ju-Hong
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.9
    • /
    • pp.97-106
    • /
    • 2010
  • Various time series representation methods have been suggested in order to process time series data efficiently and effectively. SAX is the representative time series representation method combining segmentation and discretization techniques, and it has been successfully applied to the time series classification task. However, SAX requires a large number of segments in order to represent the meaningful dynamic patterns of time series accurately, since it loses the dynamic property of time series in the course of smoothing their movement. Therefore, this paper suggests a new time series representation method that combines PIPs detection and Persist discretization techniques. The suggested method represents the dynamic movement of high-dimensional time series in a lower-dimensional space by detecting PIPs, which indicate the important inflection points of the time series. It also determines the optimal discretization ranges by applying self-transition and marginal probability distributions to the KL divergence measure. The suggested method enhances the performance of the time series classification task by minimizing the information loss in the course of dimensionality reduction.
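
The PIPs (perceptually important points) step can be sketched as below. It starts from the two end points and repeatedly adds the point farthest from the line joining its neighboring PIPs. The vertical-distance variant is used here as an assumption, since the abstract does not say which distance the paper uses, and the Persist discretization step is not shown. The function name and data are hypothetical.

```python
def find_pips(series, num_pips):
    """Return the sorted indices of num_pips perceptually important points."""
    n = len(series)
    if n < 2 or num_pips < 2:
        raise ValueError("need at least two points and two PIPs")
    pips = [0, n - 1]  # the end points are always kept
    while len(pips) < min(num_pips, n):
        best_index, best_distance = None, -1.0
        for left, right in zip(pips, pips[1:]):
            slope = (series[right] - series[left]) / (right - left)
            for i in range(left + 1, right):
                chord = series[left] + slope * (i - left)
                d = abs(series[i] - chord)  # vertical distance to the chord
                if d > best_distance:
                    best_index, best_distance = i, d
        if best_index is None:
            break
        pips.append(best_index)
        pips.sort()
    return pips

# Hypothetical series with one peak and one trough.
values = [0.0, 1.0, 5.0, 1.0, 0.0, -4.0, 0.0]
print(find_pips(values, 4))
```

Keeping only the values at the returned indices gives the lower-dimensional representation; here the peak at index 2 and the trough at index 5 are selected before any of the flatter points.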

High-Dimensional Clustering Technique using Incremental Projection (점진적 프로젝션을 이용한 고차원 글러스터링 기법)

  • Lee, Hye-Myung;Park, Young-Bae
    • Journal of KIISE:Databases
    • /
    • v.28 no.4
    • /
    • pp.568-576
    • /
    • 2001
  • Most clustering algorithms tend to degenerate rapidly in high-dimensional spaces. Moreover, high-dimensional data often contain a significant amount of noise, which further reduces the effectiveness of the algorithms. Therefore, it is necessary to develop algorithms adapted to the structure and characteristics of high-dimensional data. In this paper, we propose a clustering algorithm, CLIP, that uses projection. CLIP is designed to overcome efficiency and/or effectiveness problems in high-dimensional clustering. It is based on clustering in each one-dimensional subspace, but uses incremental projection to recover high-dimensional clusters and, at the same time, to reduce the computational cost significantly. To evaluate the performance of CLIP, we demonstrate its efficiency and effectiveness through a series of experiments on synthetic data sets.

Detection of Low-Level Human Action Change for Reducing Repetitive Tasks in Human Action Recognition (사람 행동 인식에서 반복 감소를 위한 저수준 사람 행동 변화 감지 방법)

  • Noh, Yohwan;Kim, Min-Jung;Lee, DoHoon
    • Journal of Korea Multimedia Society
    • /
    • v.22 no.4
    • /
    • pp.432-442
    • /
    • 2019
  • Most current human action recognition methods are based on deep learning, which, however, requires a very high computational cost. In this paper, we propose an action change detection method to reduce repetitive human action recognition tasks. In reality, simple actions are often repeated, and applying high-cost action recognition methods to repeated actions is time-consuming. The proposed method decides whether the action has changed, and action recognition is executed only when an action change has been detected. The action change detection process is as follows. First, the number of non-zero pixels is extracted from the motion history image to generate one-dimensional time-series data. Second, an action change is detected by comparing the difference between the current time trend and the local extremum of the time-series data with a threshold. In experiments, the proposed method achieved 89% balanced accuracy on action change data and reduced action recognition repetition by 61%.

Electricity Price Prediction Model Based on Simultaneous Perturbation Stochastic Approximation

  • Ko, Hee-Sang;Lee, Kwang-Y.;Kim, Ho-Chan
    • Journal of Electrical Engineering and Technology
    • /
    • v.3 no.1
    • /
    • pp.14-19
    • /
    • 2008
  • The paper presents an intelligent time series model to predict the uncertain electricity market price in the deregulated industry environment. Since the price of electricity in a deregulated market is very volatile, it is difficult to estimate an accurate market price using historically observed data. The parameters of the intelligent time series model are obtained using simultaneous perturbation stochastic approximation (SPSA). The SPSA is flexible to use in high-dimensional systems. Since prediction models have modeling error, an error compensator is developed. The SPSA-based intelligent model is applied to predict the electricity market price in the Pennsylvania-New Jersey-Maryland (PJM) electricity market.
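
As background for the SPSA step, the sketch below shows the standard form of the algorithm: every parameter is perturbed at once by a random ±1 vector, and the gradient is estimated from just two loss evaluations, regardless of the number of parameters. The gain constants and decay exponents are commonly used defaults, not values from the paper, and the quadratic loss is a hypothetical stand-in for the price-prediction error.

```python
import random

def spsa_minimize(loss, theta0, iterations=2000, a=0.1, c=0.1, A=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimize loss(theta) with simultaneous perturbation stochastic approximation."""
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)  # two evaluations per iteration
        # Gradient estimate for component i is diff / (2 * c_k * delta[i]).
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta

# Hypothetical loss: squared distance to an arbitrary target parameter vector.
target_params = [1.0, -2.0, 3.0]

def quadratic_loss(theta):
    return sum((t - goal) ** 2 for t, goal in zip(theta, target_params))

start = [0.0, 0.0, 0.0]
fitted = spsa_minimize(quadratic_loss, start)
print(quadratic_loss(start), quadratic_loss(fitted))
```

The cost per iteration stays at two loss evaluations as the parameter count grows, which is the property that makes SPSA convenient in high-dimensional systems.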