• Title/Summary/Keyword: time-series clustering (시계열 군집화)

A Study On Predicting Stock Prices Of Hallyu Content Companies Using Two-Stage k-Means Clustering (2단계 k-평균 군집화를 활용한 한류컨텐츠 기업 주가 예측 연구)

  • Kim, Jeong-Woo
    • Journal of the Korea Convergence Society / v.12 no.7 / pp.169-179 / 2021
  • This study shows that two-stage k-means clustering can improve stock price prediction performance. To this end, it introduces the two-stage k-means clustering algorithm and tests its prediction performance against various machine learning techniques. The method selects the cluster closest to the prediction target from an initial k-means clustering, then reapplies k-means within that cluster to find a sub-cluster even closer to the actual value. As a result, the predicted values of this method are closer to the actual stock price than those of the other machine learning techniques. Furthermore, the predictions remain relatively stable even though a relatively small cluster is used. Accordingly, the method can improve the accuracy and stability of prediction at the same time, and it can be considered a new clustering method useful for small data. Future work is needed to extend two-stage k-means clustering to large-scale data.
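
The two-stage selection described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the function names, the first-k-points initialization, the default cluster counts, and the use of the sub-cluster's mean target as the prediction are all assumptions made for the example.

```python
import math


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means on a list of equal-length tuples.

    Assumption: initial centroids are simply the first k points, chosen
    here for determinism. Returns (centroids, labels).
    """
    if not points:
        raise ValueError("need at least one point")
    k = min(k, len(points))
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: index of the nearest centroid (Euclidean).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each non-empty cluster's centroid becomes its mean.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, labels


def nearest_cluster(points, targets, query, k):
    """Cluster the points and keep only the cluster nearest to the query."""
    centroids, labels = kmeans(points, k)
    # Consider only clusters that actually received members.
    best = min(sorted(set(labels)),
               key=lambda c: math.dist(query, centroids[c]))
    keep = [i for i, lab in enumerate(labels) if lab == best]
    return [points[i] for i in keep], [targets[i] for i in keep]


def two_stage_predict(points, targets, query, k1=2, k2=2):
    """Stage 1 picks the nearest cluster; stage 2 re-clusters inside it.

    Assumption: the prediction is the mean target of the final sub-cluster.
    """
    pts1, tgt1 = nearest_cluster(points, targets, query, k1)
    pts2, tgt2 = nearest_cluster(pts1, tgt1, query, k2)
    return sum(tgt2) / len(tgt2)
```

For example, with one-dimensional features `[(1.0,), (1.2,), (5.0,), (5.2,), (9.0,), (9.2,)]` and a query of `(5.1,)`, the first stage keeps the four points from 5.0 upward and the second stage narrows them to the two points around 5.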

Evolutionary Computation-based Hybrid Clustering Technique for Manufacturing Time Series Data (제조 시계열 데이터를 위한 진화 연산 기반의 하이브리드 클러스터링 기법)

  • Oh, Sanghoun;Ahn, Chang Wook
    • Smart Media Journal / v.10 no.3 / pp.23-30 / 2021
  • Time series clustering of manufacturing data is an important grouping technique for detecting and correcting equipment and process defects from large-scale manufacturing data, but accuracy is low when existing clustering techniques designed for static data are applied to time series data. This paper presents an evolutionary computation-based time series cluster analysis approach to improve the coherence of existing clustering techniques. First, the image shape resulting from the manufacturing process is converted into one-dimensional time series data using linear scanning, and optimal sub-clusters are derived from the transformed data by hierarchical cluster analysis and split cluster analysis, both based on the Pearson distance metric. Finally, a genetic algorithm derives an optimal cluster combination with minimal similarity from the two cluster analysis results. The superiority of the proposed clustering is verified by comparing its performance with existing clustering techniques on actual manufacturing process images.
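
The preprocessing and distance steps mentioned in this abstract might look like the sketch below. The abstract does not define its scan order or its Pearson distance, so two assumptions are made here: linear scanning is taken to be a row-by-row flattening, and the distance is the common form 1 - r, where r is the Pearson correlation coefficient. The function names are illustrative only.

```python
import math


def linear_scan(image):
    """Flatten a 2-D image (list of rows) into a 1-D series.

    Assumption: rows are scanned top to bottom, left to right.
    """
    return [value for row in image for value in row]


def pearson_distance(a, b):
    """Return 1 - r for two equal-length series (0 = identical shape)."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return 1.0 - cov / math.sqrt(var_a * var_b)
```

Under this definition the distance ranges from 0 (perfectly correlated) to 2 (perfectly anti-correlated), so it depends on the shape of a series and not on its scale or offset.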

Daily Behavior Pattern Extraction using Time-Series Behavioral Data of Dairy Cows and k-Means Clustering (행동 시계열 데이터와 k-평균 군집화를 통한 젖소의 일일 행동패턴 검출)

  • Lee, Seonghun;Park, Gicheol;Park, Jaehwa
    • Journal of Software Assessment and Valuation / v.17 no.1 / pp.83-92 / 2021
  • There are continuous and extensive attempts to apply sensor systems and ICT to dairy science in order to accumulate data and improve dairy productivity. However, these efforts concern only factors that directly affect dairy productivity, such as the number of animals and milk yield, while research on the physiological aspects of dairy cows, which fundamentally underlie dairy productivity, remains insufficient. This paper proposes a basic approach for extracting daily behavior patterns from hourly behavioral data of dairy cows in order to identify health status and stress. The data were grouped into four clusters by k-means clustering, and the validity of the grouping was demonstrated by visualizing the data in each group together with each group's representative. We hope that the results will lead to further research on detecting abnormalities and disease signs in dairy cows.

A Modeling Methodology for Analysis of Dynamic Systems Using Heuristic Search and Design of Interface for CRM (휴리스틱 탐색을 통한 동적시스템 분석을 위한 모델링 방법과 CRM 위한 인터페이스 설계)

  • Jeon, Jin-Ho;Lee, Gye-Sung
    • Journal of the Korea Society of Computer and Information / v.14 no.4 / pp.179-187 / 2009
  • Most real-world systems involve a series of dynamic and complex phenomena. A common way to understand such systems is to build a model and analyze its behavior. This paper proposes a two-step methodology for analyzing time series data: clustering, followed by model creation. An interface is also designed for CRM (Customer Relationship Management) that provides users with 1:1 customized information using system modeling. Experiments confirmed that the model-based approach yields better clustering than the similarity-based one. After clustering, a model is created for each clustered group, from which the future direction of the time series can be predicted. The effectiveness of the method was validated by checking how closely the values predicted by the models track real data such as stock prices.

Screening and Clustering for Time-course Yeast Microarray Gene Expression Data using Gaussian Process Regression (효모 마이크로어레이 유전자 발현데이터에 대한 가우시안 과정 회귀를 이용한 유전자 선별 및 군집화)

  • Kim, Jaehee;Kim, Taehoun
    • The Korean Journal of Applied Statistics / v.26 no.3 / pp.389-399 / 2013
  • This article introduces Gaussian process regression and demonstrates its application to time-course microarray gene expression data. Genes in yeast cell cycle microarray expression data are screened with a ratio of log marginal likelihoods computed by Gaussian process regression with a squared exponential covariance kernel. A Gaussian process regression is fitted to each gene, and the fits are shown for the nine top-ranking genes. Gaussian model-based clustering is then applied to the screened data, and silhouette values are calculated to assess cluster validity.
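
For reference, the two quantities named in this abstract have the following standard textbook forms. The symbols are notation chosen for this note and are not taken from the article: signal variance \sigma_f^2, length scale \ell, noise variance \sigma_n^2, and a zero-mean prior. Which pair of models enters the screening ratio is not stated in the abstract, so it is not reproduced here.

```latex
% Squared exponential covariance between time points t and t'
\[
  k(t, t') = \sigma_f^2 \exp\!\left( -\frac{(t - t')^2}{2\ell^2} \right)
\]

% Log marginal likelihood of n observations y at times t_1, ..., t_n,
% where K_{ij} = k(t_i, t_j)
\[
  \log p(\mathbf{y} \mid \mathbf{t}) =
    -\tfrac{1}{2}\, \mathbf{y}^{\top} (K + \sigma_n^2 I)^{-1} \mathbf{y}
    -\tfrac{1}{2} \log \left| K + \sigma_n^2 I \right|
    -\tfrac{n}{2} \log 2\pi
\]
```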

An Intelligent Monitoring System of Semiconductor Processing Equipment using Multiple Time-Series Pattern Recognition (다중 시계열 패턴인식을 이용한 반도체 생산장치의 지능형 감시시스템)

  • Lee, Joong-Jae;Kwon, O-Bum;Kim, Gye-Young
    • The KIPS Transactions:PartD / v.11D no.3 / pp.709-716 / 2004
  • This paper describes an intelligent real-time monitoring system for semiconductor processing equipment that uses multiple time-series pattern recognition to determine whether a wafer being processed is normal. The proposed system consists of three phases: initialization, learning, and real-time prediction. The initialization phase sets the weights and the effective steps for all parameters of the monitored equipment. The learning phase uses the LBG algorithm to cluster the time series patterns that are produced and gathered while the equipment processes wafers. Each pattern has an ACI, which is measured by a tester at the end of a process. The real-time prediction phase matches a time series entered in real time against the clustered patterns using Dynamic Time Warping and finds the best-matched pattern. It then calculates a predicted ACI from a combination of the ACI, the difference, and the weights. Finally, it determines whether the wafer is in or out of spec. The proposed system is tested on data acquired from an etching device. The results show that the error between the estimated ACI and the actually measured ACI decreases markedly as the amount of learning increases.
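
The matching step in the real-time prediction phase relies on Dynamic Time Warping. A minimal sketch of the classic DTW recurrence is given below; it handles a single univariate series with an absolute-difference local cost, whereas the paper works with multiple weighted parameters, so the simplification and the function names are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance between two numeric series."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matches an earlier b
                cost[i][j - 1],      # b[j-1] also matches an earlier a
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


def best_match(series, patterns):
    """Index of the stored pattern with the smallest DTW distance."""
    return min(range(len(patterns)),
               key=lambda k: dtw_distance(series, patterns[k]))
```

Because DTW allows one sample to align with several samples of the other series, `[1, 2, 3]` and `[1, 2, 2, 3]` are at distance zero even though their lengths differ.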

A Reexamination on the Influence of Fine-particle between Districts in Seoul from the Perspective of Information Theory (정보이론 관점에서 본 서울시 지역구간의 미세먼지 영향력 재조명)

  • Lee, Jaekoo;Lee, Taehoon;Yoon, Sungroh
    • KIISE Transactions on Computing Practices / v.21 no.2 / pp.109-114 / 2015
  • This paper presents a computational model of the transfer of airborne fine particles that analyzes the similarities and influences among the 25 districts in Seoul by quantifying time series data collected from each district. The properties of each district are derived from a model of the time series of fine particle concentrations, and edge weights are calculated from the transfer entropies between all pairs of districts. We applied a modularity-based graph clustering technique to detect communities among the 25 districts. The results indicate that the discovered clusters correspond to high transfer-entropy groups among communities with geographical adjacency or high traffic volumes between them. We believe this approach can be extended to discover significant flows of other indicators that cause environmental pollution.
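
The edge weights in this abstract are transfer entropies between pairs of district series. The sketch below is a plug-in estimate of transfer entropy for already-discretized series with a history length of one step. Both choices are assumptions: the abstract does not say how the concentrations were discretized or what history length was used, and the graph clustering step is not shown.

```python
import math
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    Both inputs are equal-length sequences of discrete symbols.
    """
    if len(source) != len(target) or len(target) < 2:
        raise ValueError("need two equal-length series of length >= 2")
    n = len(target) - 1
    # Count (target_next, target_now, source_now) transitions.
    triples = Counter(
        (target[t + 1], target[t], source[t]) for t in range(n)
    )
    now_pairs = Counter()   # (target_now, source_now)
    self_pairs = Counter()  # (target_next, target_now)
    now_single = Counter()  # target_now
    for (y_next, y_now, x_now), count in triples.items():
        now_pairs[(y_now, x_now)] += count
        self_pairs[(y_next, y_now)] += count
        now_single[y_now] += count
    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        # p(y_next | y_now, x_now) versus p(y_next | y_now)
        p_full = count / now_pairs[(y_now, x_now)]
        p_self = self_pairs[(y_next, y_now)] / now_single[y_now]
        te += p_joint * math.log2(p_full / p_self)
    return te
```

Transfer entropy is asymmetric: if one series is a delayed copy of another, the estimate is large from the original to the copy and close to zero in the reverse direction, which is what allows it to serve as a directed edge weight.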

A Direction of Politic Support for Infectious Disease in Busan Using Time-series Clustering: Focusing on COVID-19 Cases (시계열 군집을 활용한 부산시 감염병 지원 정책 방향: COVID-19 사례를 중심으로)

  • Kwun, Hyeon-Ho;Kim, Do-Hee;Park, Chan-Ho;Lee, Eun-Ju;Cho, KiHaing;Bae, Hye-Rim
    • The Journal of Bigdata / v.5 no.1 / pp.125-138 / 2020
  • After COVID-19 spread in 2020, the country's Crisis Alert Level was raised to the highest level, Level 4. In response to the COVID-19 pandemic, the national and city governments took responsibility for protecting citizens in each province. The government provided health assistance first and then stepped forward to supply the resources citizens needed. Busan City likewise proposed a policy response to prepare and implement COVID-19 support for each district. Self-employed business owners, who make up a high share of the workforce, lost their basic incomes, and the impact varied across industries. To help avoid a crisis in such an epidemic, this paper proposes a clustering analysis to guide policy support for Busan City. By analyzing patterns and clustering districts and sectors, we aim to provide reference material for determining the direction of support and guiding preemptive responses in the event of a similar epidemic.

Software Measurement by Analyzing Multiple Time-Series Patterns (다중 시계열 패턴 분석에 의한 소프트웨어 계측)

  • Kim Gye-Young
    • Journal of Internet Computing and Services / v.6 no.1 / pp.105-114 / 2005
  • This paper describes a new measurement technique based on analyzing multiple time-series patterns. Its goal is to extract the actually measured value of the sample pattern that best matches an input time series, and to calculate a difference ratio relative to that value. The proposed technique therefore performs measurement rather than recognition, and does so in software rather than hardware. It consists of three stages: initialization, learning, and measurement. In the initialization stage, it decides the weights of all parameters using importance values given by an operator. In the learning stage, it classifies sample patterns using the LBG and DTW algorithms and then creates code sequences for all the patterns. In the measurement stage, it creates a code sequence for an input time-series pattern, finds samples with the same code sequence by hashing, and then selects the best-matched sample. Finally, it outputs the actually measured value of that sample together with the difference ratio. For performance evaluation, we tested the technique on multiple time-series patterns obtained from an etching machine used in semiconductor manufacturing.

Clustering Korean Stock Return Data Based on GARCH Model (이분산 시계열모형을 이용한 국내주식자료의 군집분석)

  • Park, Man-Sik;Kim, Na-Young;Kim, Hee-Young
    • Communications for Statistical Applications and Methods / v.15 no.6 / pp.925-937 / 2008
  • In this study, we consider cluster analysis of the returns of stocks traded in the stock market. Most financial time series, for instance stock prices and exchange rates, have conditionally heteroscedastic variability that changes over time, and hence are not properly modeled by the autoregressive moving-average (ARMA) model, which assumes constant variance. Moreover, this variability is front and center for stock investors as well as academic researchers. This paper therefore focuses on the generalized autoregressive conditional heteroscedastic (GARCH) model, which is known as a solution for capturing the conditional variance (or volatility). We define metrics for similarity of unconditional volatility and for homogeneity of model structure, and then evaluate the performance of these metrics. In a real application, we cluster the stock returns of 11 Korean companies, measured over the latest three years, in terms of volatility and structure.
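
To make the term "unconditional volatility" concrete, the standard GARCH(1,1) specification is written out below. The order (1,1) and the symbols are assumptions made for illustration, since the abstract names only the GARCH family, and the paper's own similarity metrics are not reproduced here.

```latex
% Return r_t with conditional variance \sigma_t^2 and i.i.d. z_t
% (zero mean, unit variance)
\[
  r_t = \mu + \varepsilon_t, \qquad
  \varepsilon_t = \sigma_t z_t, \qquad
  \sigma_t^2 = \omega + \alpha\, \varepsilon_{t-1}^2 + \beta\, \sigma_{t-1}^2
\]

% Unconditional (long-run) variance, which exists when \alpha + \beta < 1
\[
  \operatorname{Var}(\varepsilon_t) = \frac{\omega}{1 - \alpha - \beta}
\]
```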