• Title/Summary/Keyword: Time Series and Cluster Analysis

A spectrum based evaluation algorithm for micro scale weather analysis module with application to time series cluster analysis (스펙트럼분석 기반의 미기상해석모듈 평가알고리즘 제안 및 시계열 군집분석에의 응용)

  • Kim, Hea-Jung;Kwak, Hwa-Ryun;Kim, Yu-Na;Choi, Young-Jean
    • Journal of the Korean Data and Information Science Society / v.26 no.1 / pp.41-53 / 2015
  • In the meteorological field, many researchers have tried to develop micro scale weather analysis modules (MSWAM) that provide real-time weather information services in metropolitan areas. This effort helps us cope with the economic and social harm caused by serious changes in the micrometeorology of a metropolitan area under rapid urbanization, such as the quantitative expansion of urban activity, population growth, and building concentration. The accuracy of the MSWAM is directly related to the usefulness and quality of the real-time weather information service. This paper designs an evaluation system, along with verification tools, that sufficiently accommodates the spatio-temporal characteristics of the MSWAM outputs. To this end, we propose a test for the equality of the mean vectors of the MSWAM output series and the corresponding observed time series, using a spectral analysis technique. As a byproduct, a time series cluster analysis method is developed that uses a function of the test statistic as the distance measure. A real data application is given to demonstrate the utility of the method.
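The abstract does not state the test statistic, so the following is only a rough illustration of what a spectrum-based distance between two time series can look like. It compares raw periodograms with a Euclidean distance. This is a stand-in chosen for illustration, not the paper's method, and the names `periodogram` and `spectral_distance` are hypothetical.

```python
import cmath
import math


def periodogram(series):
    """Periodogram ordinates at the Fourier frequencies k = 1 .. n // 2."""
    n = len(series)
    values = []
    for k in range(1, n // 2 + 1):
        # Discrete Fourier transform coefficient at frequency k / n.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(series)
        )
        values.append(abs(coeff) ** 2 / n)
    return values


def spectral_distance(a, b):
    """Euclidean distance between two periodograms (illustrative stand-in only).

    Assumes both series have the same length.
    """
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    pa, pb = periodogram(a), periodogram(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))


# Two hypothetical series with different dominant periods.
model_output = [math.sin(2 * math.pi * t / 8) for t in range(16)]
observed = [math.sin(2 * math.pi * t / 4) for t in range(16)]
print(spectral_distance(model_output, observed))
```

A matrix of such pairwise distances could then be passed to any distance-based clustering routine, which is the general idea behind using a test-statistic function as the distance measure.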

Gene Screening and Clustering of Yeast Microarray Gene Expression Data (효모 마이크로어레이 유전자 발현 데이터에 대한 유전자 선별 및 군집분석)

  • Lee, Kyung-A;Kim, Tae-Houn;Kim, Jae-Hee
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.1077-1094 / 2011
  • We perform cluster analyses of yeast cell cycle microarray expression data. To reflect the characteristics of time-course data, we screen the genes using test statistics based on Fourier coefficients together with an FDR procedure. We compare the results of model-based clustering, K-means, PAM, SOM, the hierarchical Ward method, and a fuzzy method on the yeast data. As validity measures for the clustering results, connectivity, the Dunn index, and silhouette values are computed and compared. A biological interpretation using GO analysis is also included.
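Of the validity measures named above, the silhouette value is the simplest to compute by hand. The sketch below uses the standard definition s(i) = (b - a) / max(a, b) with Euclidean distance; the toy points and the function name `silhouette_values` are assumptions for illustration and do not come from the paper.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_values(points, labels):
    """Silhouette value for every point; requires at least two clusters.

    a = mean distance to the other members of the point's own cluster,
    b = smallest mean distance to the members of any other cluster.
    Points in singleton clusters get 0 by convention.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Toy data: two well-separated groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(silhouette_values(points, [0, 0, 1, 1]))  # values close to 1
print(silhouette_values(points, [0, 1, 0, 1]))  # negative values
```

Values near 1 indicate a point sits well inside its cluster, while negative values suggest it would fit better elsewhere, which is why the measure is useful for comparing several clustering methods on the same data.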

A Statistical Analysis of the Causes of Marine Incidents occurring during Berthing (정박 중 발생한 준해양사고 원인에 대한 통계 분석 연구)

  • Roh, Boem-Seok;Kang, Suk-Young
    • Journal of Navigation and Port Research / v.45 no.3 / pp.95-101 / 2021
  • Marine incidents, viewed in light of Heinrich's law, are very important in preventing accidents. However, marine incident data are mainly qualitative and are used to prevent similar accidents through case sharing rather than statistical analysis, as can be confirmed in the marine incident data posted by the Korea Maritime Safety Tribunal. Therefore, this study derived quantitative results by analyzing the causes of marine incidents during berthing using various methods of statistical analysis. To this end, data on marine incidents from various shipping companies were collected and reclassified for ease of analysis. The main keywords were derived in a primary analysis using text mining. Only meaningful words were selected through verification by an expert group, and time series and cluster analysis were performed to predict marine incidents that may occur during berthing. Although an expert group was still required during the analysis, it was confirmed that quantitative analysis of marine incidents is feasible and can be used to provide cause and accident prevention information.

Software Measurement by Analyzing Multiple Time-Series Patterns (다중 시계열 패턴 분석에 의한 소프트웨어 계측)

  • Kim Gye-Young
    • Journal of Internet Computing and Services / v.6 no.1 / pp.105-114 / 2005
  • This paper describes a new measurement technique based on analyzing multiple time-series patterns. The goal is to retrieve the actually measured value of the sample pattern that best matches an input time series, and to calculate a difference ratio relative to that value. The proposed technique therefore performs measurement rather than recognition, and is implemented in software rather than hardware. It consists of three stages: initialization, learning, and measurement. In the initialization stage, it sets the weights of all parameters using importance values given by an operator. In the learning stage, it classifies sample patterns using the LBG and DTW algorithms, and then creates code sequences for all the patterns. In the measurement stage, it creates a code sequence for an input time-series pattern, finds samples having the same code sequence by hashing, and then selects the best-matched sample. Finally, it outputs the actually measured value of that sample together with the difference ratio. For performance evaluation, we tested the technique on multiple time-series patterns obtained from an etching machine used in semiconductor manufacturing.

Time Series Patterns and Clustering of Rotifer Community in Relation with Topographical Characteristics in Lentic Ecosystems (정수생태계의 지형적인 요인 변화와 윤충류 출현 종 수 및 개체군 밀도 변동에 대한 연구)

  • Oh, Hye-Ji;Heo, Yu-Ji;Chang, Kwang-Hyeon;Kim, Hyun-Woo
    • Korean Journal of Ecology and Environment / v.54 no.4 / pp.390-397 / 2021
  • Time series data on the rotifer community, focusing on species number and total density, were collected quarterly from 29 reservoirs in Jeonnam Province from 2008 to 2016. The reservoirs had similar weather conditions during the study period, but their sizes and water quality differed. To analyze the temporal dynamics of the rotifer community, the medians, ranges, outliers, and coefficient of variation (CV) of rotifer species number and abundance were compared. For the temporal trend analysis, the time series of each reservoir were compared and clustered using the dynamic time warping function of the R package "dtwclust". Small reservoirs showed higher variability in rotifer abundance, with more frequent outliers, than large reservoirs. In contrast, no apparent pattern was observed for rotifer species number. For the temporal pattern of rotifer density, COD, fluctuations in phytoplankton abundance, and fluctuations in cladoceran abundance are suggested as potential factors affecting rotifer abundance dynamics.
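For readers unfamiliar with dynamic time warping, the following minimal sketch shows the classic dynamic-programming recurrence with an absolute-difference local cost. It is not the "dtwclust" implementation, which offers windowing, step patterns, and other options not reproduced here; the function names and the sample series are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    cost[i][j] holds the cheapest cumulative cost of aligning a[:i] with b[:j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the best of: insertion, deletion, or match.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def pairwise_dtw(series_list):
    """Symmetric matrix of DTW distances, usable as input to clustering."""
    size = len(series_list)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            d = dtw_distance(series_list[i], series_list[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical quarterly abundance series for three reservoirs.
reservoirs = [[1, 2, 3], [1, 2, 2, 3], [5, 5, 5]]
for row in pairwise_dtw(reservoirs):
    print(row)
```

Because DTW allows one series to stretch relative to another, the first two series above have distance zero even though their lengths differ, which is what makes the measure attractive for comparing temporal patterns that are shifted or unevenly paced.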

A Study of Search Methodology for Efficient Clustering (효율적 군집화를 위한 탐색 방법 연구)

  • Jeon, Jin-Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.10a / pp.571-573 / 2010
  • Most real-world systems, such as those in the world economy, management, medicine, and engineering, involve a series of complex phenomena. One common way to understand such a system is to build a model and analyze the system's behavior. The first step is to determine the best clusters in the data; the second is to determine the model of each cluster. In this paper, we investigate heuristic search methods for efficient clustering.

Technology Development Strategy of Piggyback Transportation System Using Topic Modeling Based on LDA Algorithm

  • Jun, Sung-Chan;Han, Seong-Ho;Kim, Sang-Baek
    • Journal of the Korea Society of Computer and Information / v.25 no.12 / pp.261-270 / 2020
  • In this study, we identify promising technologies for the Piggyback transportation system by analyzing relevant patent information. To do so, we first build a patent database by extracting relevant technology keywords from pioneering research papers on the Piggyback flatcar system. We then employ text mining to identify frequently occurring words in the patent database and, using these words, apply the LDA (Latent Dirichlet Allocation) algorithm to identify "topics" that correspond to "key" technologies for the Piggyback system. Finally, we employ the ARIMA model to forecast the trends of these "key" technologies and, together with a keyword search method in the patent analysis, identify the promising technologies for the Piggyback system. The results show that data-driven integrated management systems, operation planning systems, and special cargo (especially fluid and gas) handling/storage technologies are the "key" promising technologies for the future of the Piggyback system, and that data reception/analysis techniques must be developed in order to improve system performance. The proposed procedure and analysis method provide useful insights for developing the R&D strategy and technology roadmap for the Piggyback system.

Classification of Land Cover over the Korean Peninsula Using Polar Orbiting Meteorological Satellite Data (극궤도 기상위성 자료를 이용한 한반도의 지면피복 분류)

  • Suh, Myoung-Seok;Kwak, Chong-Heum;Kim, Hee-Soo;Kim, Maeng-Ki
    • Journal of the Korean Earth Science Society / v.22 no.2 / pp.138-146 / 2001
  • The land cover over the Korean peninsula was classified using multi-temporal NOAA/AVHRR (Advanced Very High Resolution Radiometer) data. Four types of phenological data derived from the 10-day composited NDVI (Normalized Difference Vegetation Index), the maximum and annual mean land surface temperature, and topographical data were used, both to reduce the data volume and to increase the classification accuracy. A self-organizing feature map (SOFM), a kind of neural network technique, was used to cluster the satellite data, and a decision tree was used to classify the clusters. When the classification results were compared with the NDVI time series and other available ground truth data, urban areas, agricultural areas, deciduous trees, and evergreen trees were clearly classified.

Half a Century of Climatology in Korea: Retrospect and Prospect (한국의 기후학 반세기:회고와 전망)

  • 이현영
    • Journal of the Korean Geographical Society / v.31 no.2 / pp.128-137 / 1996
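  • Research output in Korean climatology has grown steadily, with some fluctuations, since the first work was published in 1958. By subfield, general climatology (43.5%) accounts for the largest share, followed by synoptic climatology (34.7%), climate change (13.0%), and applied climatology (8.8%), although research in applied climatology has been gradually increasing in recent years. Before the 1970s, studies mainly used surface meteorological data to characterize the climate at the national scale through correlations between climatic elements, frequency of occurrence, and time series analysis. More recently, statistical techniques including multivariate methods such as cluster, principal component, and factor analysis have been widely used alongside time series analysis. Early studies relied mainly on surface meteorological data, but with the growing use of upper-air and satellite data the field has advanced to local climate research and to the stage of building climate prediction models. However, the problems facing Korean climatology are an absolute shortage of human resources and a research environment that is poor compared with neighboring fields. Specifically, climatology is frequently taught at universities by non-specialists, and local climate research sometimes requires field measurements, yet the equipment needed to generate and analyze data is severely lacking. Therefore, to promote the development of climatology in Korea, training climatologists is the most urgent task, and mutual cooperation, such as the exchange of research and data among climatologists as well as among universities and research institutes, is called for.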

A Realtime Wearable System for Upper Body Rehabilitation of Disabled (장애인 상지 재활운동 지원을 위한 실시간 웨어러블 시스템)

  • Su-Bin Oh;Min-Jeong Kang;Min-Goo Lee;Sang-Min Lee
    • Annual Conference of KIPS / 2023.05a / pp.420-422 / 2023
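  • This study introduces the development of an AI-based personalized service that uses wearable devices to assist rehabilitation exercise for people with disabilities. The service collects and manages sensor data such as heart rate, calories burned, and exercise duration from users who exercise while wearing a wearable device. User biometric data are managed through real-time client-server communication and stored on a server built with Django REST framework. For time series clustering analysis of the data collected through the proposed system, k-means clustering and k-shape clustering were used to analyze heart rate, a key indicator for fitness assessment. In particular, the system can provide personalized analysis and interpretation of exercise capacity for users with disabilities, for whom exercise is relatively difficult.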
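k-shape clustering differs from k-means mainly in its distance: it uses a shape-based distance (SBD) defined as one minus the maximum normalized cross-correlation over all shifts, usually on z-normalized series. The sketch below computes that distance directly, without the FFT speed-up used in practical implementations. The heart-rate numbers and function names are hypothetical, and nothing here reflects the paper's actual code.

```python
import math


def z_normalize(series):
    """Rescale a series to zero mean and unit variance (zeros if constant)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in series]


def shape_based_distance(x, y):
    """SBD = 1 - max over shifts of the normalized cross-correlation.

    Assumes equal-length series. Returns a value in [0, 2]; 0 means the
    shapes match after z-normalization and an optimal shift.
    """
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    x, y = z_normalize(x), z_normalize(y)
    n = len(x)
    norm = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    if norm == 0:
        # Convention chosen for this sketch: a constant series has no shape.
        return 1.0
    best = max(
        sum(x[t + shift] * y[t] for t in range(n) if 0 <= t + shift < n)
        for shift in range(-(n - 1), n)
    )
    return 1.0 - best / norm


# Hypothetical heart-rate traces (beats per minute) from two sessions.
session_a = [60, 62, 70, 90, 110, 95, 75, 65]
session_b = [2 * v + 10 for v in session_a]  # same shape, different scale
print(shape_based_distance(session_a, session_b))
```

Because the distance ignores amplitude and offset, two users whose heart rates rise and fall in the same pattern are grouped together even if their resting rates differ, which is one reason to compare k-shape against plain k-means on this kind of data.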