• Title/Abstract/Keywords: data pattern

Search results: 8,428 items (processing time 0.035 seconds)

Implementation of Subsequence Mapping Method for Sequential Pattern Mining

  • Trang Nguyen Thu;Lee Bum-Ju;Lee Heon-Gyu;Park Jeong-Seok;Ryu Keun-Ho
    • 대한원격탐사학회지
    • /
    • Vol. 22, No. 5
    • /
    • pp.457-462
    • /
    • 2006
  • Sequential pattern mining addresses the problem of discovering the maximal frequent sequences that exist in a given database. In daily and scientific life, sequential data are available and used everywhere, in forms such as text, weather data, satellite data streams, business transactions, telecommunications records, experimental runs, DNA sequences, and histories of medical records. Discovering sequential patterns can help users or scientists predict upcoming activities, interpret recurring phenomena, or extract similarities. To that end, the core of sequential pattern mining is finding the frequent sequences, that is, the sequences that occur frequently across the data sequences. Besides discovering frequent itemsets, sequential pattern mining requires arranging those itemsets into sequences and discovering which of those sequences are frequent. So before mining sequences, the main task is checking whether one sequence is a subsequence of another sequence in the database. In this paper, we implement the subsequence matching method as a preprocessing step for sequential pattern mining. The sequences matched in our implementation are normalized sequences in the form of number chains. The result given by this method is a review of the matching information between the input mapped sequences.
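
The subsequence check described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each sequence is a list of itemsets whose items are already mapped to integers (the "number chain" form), and the names `is_subsequence` and `support` are hypothetical.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[int]      # items assumed already normalized to integers
Sequence = List[Itemset]      # a sequence is an ordered list of itemsets


def is_subsequence(candidate: Sequence, data: Sequence) -> bool:
    """Return True if every itemset of `candidate` is contained, in order,
    in some itemset of `data` (greedy left-to-right matching)."""
    pos = 0
    for itemset in candidate:
        # Advance until an itemset of `data` contains the current one.
        while pos < len(data) and not itemset <= data[pos]:
            pos += 1
        if pos == len(data):
            return False
        pos += 1  # the next candidate itemset must match strictly later
    return True


def support(candidate: Sequence, database: List[Sequence]) -> int:
    """Count the data sequences that contain `candidate` as a subsequence."""
    return sum(is_subsequence(candidate, seq) for seq in database)


db = [
    [frozenset({1, 5}), frozenset({4}), frozenset({2, 3, 6})],
    [frozenset({2, 3}), frozenset({1})],
]
pattern = [frozenset({1}), frozenset({2, 3})]
print(support(pattern, db))
```

Greedy matching is sufficient here because taking the earliest possible match for each itemset never rules out a later match.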

패턴사전과 비정형성을 통한 이상치 탐지방법 적용 (Anomaly Detection via Pattern Dictionary Method and Atypicality in Application)

  • 오세홍;박종성;윤영삼
    • 센서학회지
    • /
    • Vol. 32, No. 6
    • /
    • pp.481-486
    • /
    • 2023
  • Anomaly detection holds paramount significance across diverse fields, encompassing fraud detection, risk mitigation, and sensor evaluation tests. Its pertinence extends notably to the military, particularly within the Warrior Platform, a comprehensive combat equipment system with wearable sensors. Hence, we propose a data-compression-based anomaly detection approach tailored to unlabeled time series and sequence data. This method entailed the construction of two distinctive features, typicality and atypicality, to discern anomalies effectively. The typicality of a test sequence was determined by evaluating the compression efficacy achieved through the pattern dictionary. This dictionary was established based on the frequency of all patterns identified in a training sequence generated for each sensor within Warrior Platform. The resulting typicality served as an anomaly score, facilitating the identification of anomalous data using a predetermined threshold. To improve the performance of the pattern dictionary method, we leveraged atypicality to discern sequences that could undergo compression independently without relying on the pattern dictionary. Consequently, our refined approach integrated both typicality and atypicality, augmenting the effectiveness of the pattern dictionary method. Our proposed method exhibited heightened capability in detecting a spectrum of unpredictable anomalies, fortifying the stability of wearable sensors prevalent in military equipment, including the Army TIGER 4.0 system.
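
The idea of scoring a test sequence by how well it compresses against a dictionary built from training data can be illustrated with a rough sketch. The abstract does not specify the pattern dictionary, so this example swaps in the preset-dictionary feature of Python's standard `zlib` module as a stand-in; the function names, the byte encoding of the sequences, and the sample data are all assumptions made for illustration.

```python
import random
import zlib


def compressed_size(sequence: bytes, dictionary: bytes = b"") -> int:
    """Size in bytes of `sequence` after DEFLATE compression,
    optionally primed with a preset dictionary."""
    if dictionary:
        compressor = zlib.compressobj(level=9, zdict=dictionary)
    else:
        compressor = zlib.compressobj(level=9)
    return len(compressor.compress(sequence) + compressor.flush())


def dictionary_score(sequence: bytes, dictionary: bytes) -> float:
    """Compressed bytes per input byte when using the training dictionary.
    A higher value means the sequence looks less like the training data."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return compressed_size(sequence, dictionary) / len(sequence)


# Hypothetical data: a periodic training signal, a similar test sequence,
# and a pseudo-random test sequence that shares no patterns with it.
training = b"0123456789" * 50
normal = b"3456789012" * 5
rng = random.Random(0)
anomalous = bytes(rng.randrange(256) for _ in range(50))

print(dictionary_score(normal, training))
print(dictionary_score(anomalous, training))
```

Calling `compressed_size` without a dictionary gives a crude analogue of the dictionary-free compression that the abstract associates with atypicality; how the two quantities are combined in the paper is not stated there, so it is not modeled here.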

A Comparison of Clustering Algorithm in Data Mining

  • Lee, Yung-Seop;An, Mi-Young
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 14, No. 4
    • /
    • pp.725-736
    • /
    • 2003
  • To provide the information needed to make a decision, it is important to know the relationships or patterns between variables in a database. Grouping objects that have similar pattern characteristics is called cluster analysis, one of the data mining techniques. This study compares several partitioning clustering algorithms based on the statistical distance or the total variance within each cluster.


Development of an Event Stream Processing System for the Vehicle Telematics Environment

  • Kim, Jong-Ik;Kwon, Oh-Cheon;Kim, Hyun-Suk
    • ETRI Journal
    • /
    • Vol. 31, No. 4
    • /
    • pp.463-465
    • /
    • 2009
  • In this letter, we present an event stream processing system that can evaluate a pattern query for a data sequence with predicates. We propose a pattern query language and develop a pattern query processing system. In our system, we propose novel techniques for run-time aggregation and negation processing and apply our system to stream data generated from vehicles to monitor unusual driving patterns.

File Modification Pattern Detection Mechanism Using File Similarity Information

  • Jung, Ho-Min;Ko, Yong-Woong
    • International journal of advanced smart convergence
    • /
    • Vol. 1, No. 1
    • /
    • pp.34-37
    • /
    • 2012
  • In a storage system, the performance of data deduplication can be increased if we consider the file modification pattern. For example, if a file is modified at the end of the file, then a fixed-length chunking algorithm is superior to variable-length chunking. Therefore, it is important to predict the location at which a file has been modified between files. In this paper, the essential idea is to exploit an efficient file pattern checking scheme that can be used in a data deduplication system. A data deduplication system can use the file modification pattern to select its deduplication algorithm. Experimental results show that the proposed system can predict the file modification region with high probability.
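
Why the location of a modification matters for fixed-length chunking can be shown with a small sketch. It is not the paper's detection scheme: the chunk size, the use of SHA-256 as the chunk fingerprint, and the helper names `fixed_chunks` and `shared_ratio` are assumptions chosen for illustration.

```python
import hashlib
from typing import List


def fixed_chunks(data: bytes, size: int = 8) -> List[str]:
    """Split `data` into fixed-length chunks and fingerprint each one."""
    return [
        hashlib.sha256(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    ]


def shared_ratio(old: bytes, new: bytes, size: int = 8) -> float:
    """Fraction of chunks in `new` that already exist in `old`,
    i.e. the share of the new file that deduplication could skip."""
    old_hashes = set(fixed_chunks(old, size))
    new_hashes = fixed_chunks(new, size)
    if not new_hashes:
        return 0.0
    return sum(h in old_hashes for h in new_hashes) / len(new_hashes)


original = bytes(range(64))                 # 8 chunks of 8 bytes
appended = original + b"tail data!"         # modified at the end
prepended = b"X" + original                 # modified at the start

print(shared_ratio(original, appended))
print(shared_ratio(original, prepended))
```

Appending leaves every earlier chunk boundary in place, so the old chunks are found again. Inserting a single byte at the start shifts all boundaries, which is the case where variable-length (content-defined) chunking would be expected to do better.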

The classified method for overlapping data

  • Kruatrachue, Boontee;Warunsin, Kulwarun;Siriboon, Kritawan
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 제어로봇시스템학회 ICCAS 2004
    • /
    • pp.2037-2040
    • /
    • 2004
  • In this paper we introduce a new prototype-based classifier for overlapping data, where training patterns can overlap in the feature space. The proposed classifier is based on the prototype from the neural network classifier (NNC)[1] for overlapping data. The method automatically chooses the initial center and two radii for each class. The center is used as a mean representative of the training data for each class. An unclassified pattern is classified by measuring its distance from the class center. If the distance is within the lower (shorter) radius, the unknown pattern has a high percentage of being in this class. If the distance is between the lower and the upper (longer) radius, the pattern has some probability of being in this class or in others. But if the distance is outside the upper radius, the pattern is not in this class. We borrow the words upper and lower from rough set theory to represent the region of certainty [3]. The training algorithm to find the number of clusters and their parameters (center, lower, upper) is presented. The clustering result is tested using patterns from Thai handwritten letters, and it is very similar to clustering done by the human eye.
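
The three-region decision rule in the abstract above can be sketched as follows. The abstract does not describe the training algorithm that picks the center and radii, so the prototype values below are made up, Euclidean distance is assumed, and the names `Prototype` and `region` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Prototype:
    label: str
    center: Tuple[float, ...]   # mean representative of the class
    lower: float                # shorter radius: region of certainty
    upper: float                # longer radius: region of possibility


def region(proto: Prototype, point: Tuple[float, ...]) -> str:
    """Place `point` relative to one class prototype."""
    distance = math.dist(proto.center, point)  # Euclidean distance (assumed)
    if distance <= proto.lower:
        return "certain"     # very likely a member of this class
    if distance <= proto.upper:
        return "possible"    # may belong to this class or to another
    return "outside"         # not a member of this class


# Hypothetical prototype; real values would come from the training step.
class_a = Prototype(label="A", center=(0.0, 0.0), lower=1.0, upper=2.0)

for p in [(0.5, 0.0), (1.5, 0.0), (3.0, 0.0)]:
    print(p, region(class_a, p))
```

With several classes, a pattern that falls in the "possible" region of more than one prototype is exactly the overlapping case the classifier is meant to handle.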


콜레스테롤 자료에 대한 적정 공분산행렬 형태 산출에 관한 통계적 분석 (A statistical analysis on the selection of the optimal covariance matrix pattern for the cholesterol data)

  • 조진남;백재욱
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 21, No. 6
    • /
    • pp.1263-1270
    • /
    • 2010
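  • Sixty patients were divided into three groups of 20, each group was given a different type of diet, and repeated measurements of cholesterol levels were then obtained at one-week intervals over five weeks. Goodness-of-fit and significance tests on these data showed that the equal-variance (homogeneous) Toeplitz structure was the best-fitting covariance model among the various covariance matrix patterns. Under this model, the correlation coefficients between time points were 0.64-0.78, indicating generally high correlations. Significance tests of the model parameters showed that the time effect was highly significant, whereas the treatment effect and the treatment-by-time interaction were not significant.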
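
For reference, the equal-variance Toeplitz structure named in the abstract above has the following general form for five repeated measurements. The symbols are assumptions introduced here: \sigma^2 is the common variance at every time point and \rho_k is the correlation between measurements k weeks apart, so the correlation depends only on the lag.

```latex
\Sigma = \sigma^2
\begin{pmatrix}
1      & \rho_1 & \rho_2 & \rho_3 & \rho_4 \\
\rho_1 & 1      & \rho_1 & \rho_2 & \rho_3 \\
\rho_2 & \rho_1 & 1      & \rho_1 & \rho_2 \\
\rho_3 & \rho_2 & \rho_1 & 1      & \rho_1 \\
\rho_4 & \rho_3 & \rho_2 & \rho_1 & 1
\end{pmatrix}
```

This structure needs five parameters (one variance and four lag correlations), compared with fifteen for an unstructured 5 x 5 covariance matrix.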

클러스터링 기법을 이용한 수용가별 전력 데이터 패턴 분석 (Customer Load Pattern Analysis using Clustering Techniques)

  • 유승형;김홍석;오도은;노재구
    • KEPCO Journal on Electric Power and Energy
    • /
    • Vol. 2, No. 1
    • /
    • pp.61-69
    • /
    • 2016
  • Understanding load patterns and customer classification is a basic step in analyzing the behavior of electricity consumers. To that end, there has been much research on clustering customers' daily load data. Nowadays, the deployment of advanced metering infrastructure (AMI) and big-data technologies makes it easier to study customers' load data. In this paper, we study load clustering from the viewpoint of yearly and daily load patterns. We compare four clustering methods: K-means clustering, hierarchical clustering (average and Ward's method), and DBSCAN (Density-Based Spatial Clustering of Applications with Noise). We also discuss the relationship between the clustering results and the Korean Standard Industrial Classification (KSIC), which is one of the possible labels for customers' load data. We find that hierarchical clustering with Ward's method is suitable for clustering load data, and that KSIC can be characterized well by the daily load pattern but not quite as well by the yearly load pattern.

The fashion consumer purchase patterns and influencing factors through big data - Based on sequential pattern analysis -

  • Ki Yong Kwon
    • 복식문화연구
    • /
    • Vol. 31, No. 5
    • /
    • pp.607-626
    • /
    • 2023
  • This study analyzes consumer fashion purchase patterns from a big data perspective. Transaction data from 1 million transactions at two Korean fashion brands were collected. To analyze the data, R, Python, the SPADE algorithm, and network analysis were used. Various consumer purchase patterns, including overall purchase patterns, seasonal purchase patterns, and age-specific purchase patterns, were analyzed. Overall pattern analysis found that a continuous purchase pattern was formed around the brands' popular items such as t-shirts and blouses. Network analysis also showed that t-shirts and blouses were highly centralized items. This suggests that there are items that make consumers loyal to a brand rather than the cachet of the brand name itself. These results help us better understand the process of brand equity construction. Additionally, buying patterns varied by season, and more items were purchased in a single shopping trip during the spring season compared to other seasons. Consumer age also affected purchase patterns; findings showed an increase in purchasing the same item repeatedly as age increased. This likely reflects the difference in purchasing power according to age, and it suggests that the decision-making process for purchasing products simplifies as age increases. These findings offer insight for fashion companies' establishment of item-specific marketing strategies.

한국인을 위한 장갑 패턴 고찰 (1) - 업체 조사를 통한 손계측 항목을 중심으로 - (A Study on the Measurement of Korean Hand - Focusing on Glove & Hand Dimension -)

  • 류경옥
    • 복식문화연구
    • /
    • Vol. 17, No. 5
    • /
    • pp.866-877
    • /
    • 2009
  • The purpose of this study was to develop the hand dimensions needed for pattern-making of gloves for Koreans. Glove pattern-making is difficult because it combines anthropometric and engineering aspects. In addition, existing dimension data are insufficient for glove pattern-making. Therefore, to develop the dimensions for gloves, this study reviewed a comprehensive list of candidate hand data and interviewed manufacturers (with careers of over 15 years) about their glove-making methods. Comparing the structure of the hand with existing glove patterns led to the following conclusion. Pattern-making for gloves needs the measurements of hand length, thumb length, index finger length, middle finger length, ring finger length, hand circumference, thumb-ring finger circumference, and maximum hand thickness.
