• Title/Summary/Keyword: 빈발도 (frequency of occurrence)


Clustering Algorithm using the DFP-Tree based on the MapReduce (맵리듀스 기반 DFP-Tree를 이용한 클러스터링 알고리즘)

  • Seo, Young-Won;Kim, Chang-soo
    • Journal of Internet Computing and Services / v.16 no.6 / pp.23-30 / 2015
  • As big data has come to prominence, many applications that operate on the results of data analysis have been developed; typical examples are product recommendation services in e-commerce systems, search services in search engines, and friend recommendation in social network services. In this paper, we propose a decision frequent pattern tree (DFP-Tree) that combines the original frequent pattern tree, an existing data mining technique that mines similar patterns appearing in a data set, with the decision tree from computer science theory. The decision frequent pattern tree algorithm addresses a problem of the frequent pattern tree: it generates so many patterns that the data become hard to analyze. We also propose a model for the MapReduce framework, a programming model that supports operation in a distributed environment.

An Efficient Hashing Mechanism of the DHP Algorithm for Mining Association Rules (DHP 연관 규칙 탐사 알고리즘을 위한 효율적인 해싱 메카니즘)

  • Lee, Hyung-Bong
    • The KIPS Transactions:PartD / v.13D no.5 s.108 / pp.651-660 / 2006
  • Algorithms for mining association rules based on the Apriori algorithm use a hash tree to store candidate frequent itemsets and count their supports, and most of the execution time is spent searching the hash tree. The DHP (Direct Hashing and Pruning) algorithm tries to reduce the number of candidate frequent itemsets in order to save search time in the hash tree. For this purpose, the DHP algorithm performs a simple preliminary count of the supports of the candidate frequent itemsets, using a direct hash table to reduce the overhead of this preliminary counting. This paper proposes and evaluates an efficient hashing mechanism for the direct hash table $H_2$, which is used for pruning in phase 2, and for the hash tree $C_k$, which is used for counting supports of the candidate frequent itemsets in all phases. The results showed that the proposed hashing mechanism improved performance by up to 82.2% and by 18.5% on average compared to the conventional method using a simple mod operation.
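
The pruning role of the direct hash table $H_2$ can be illustrated with a minimal Python sketch. It is not the hashing mechanism proposed in the paper: it uses a plain mod-based hash of the kind the abstract calls the conventional method, and the function name `dhp_candidate_pairs`, the hash formula, and the integer item encoding are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def dhp_candidate_pairs(transactions, min_support, num_buckets):
    """Return candidate 2-itemsets that survive DHP-style bucket pruning.

    Items are assumed to be small non-negative integers (illustrative only).
    """
    item_counts = Counter()
    buckets = [0] * num_buckets  # direct hash table for 2-itemsets

    def bucket_of(pair):
        a, b = pair
        # Simple mod-based hash; the formula itself is an assumption.
        return (a * 10 + b) % num_buckets

    # Single pass: count 1-itemsets and hash every 2-itemset into a bucket.
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        for pair in combinations(items, 2):
            buckets[bucket_of(pair)] += 1

    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_support)

    # A pair can only be frequent if its bucket count reaches min_support,
    # so pairs that land in light buckets are pruned before the hash tree
    # is ever built.
    return [
        pair
        for pair in combinations(frequent_items, 2)
        if buckets[bucket_of(pair)] >= min_support
    ]


transactions = [[1, 2, 3], [1, 2], [2, 3], [1, 2, 4]]
print(dhp_candidate_pairs(transactions, min_support=2, num_buckets=7))
```

Because a bucket count is an upper bound on the support of every pair hashed into it, the pruning never discards a truly frequent pair; collisions can only let extra candidates through.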

Discovery of Frequent Sequence Pattern in Moving Object Databases (이동 객체 데이터베이스에서 빈발 시퀀스 패턴 탐색)

  • Vu, Thi Hong Nhan;Lee, Bum-Ju;Ryu, Keun-Ho
    • The KIPS Transactions:PartD / v.15D no.2 / pp.179-186 / 2008
  • The convergence of location-aware devices and GIS functionality, together with the increasing accuracy and availability of positioning technologies, paves the way for a range of new location-based services. The field of spatiotemporal data mining, where relationships are defined by the spatial and temporal aspects of data, faces major challenges because of the enlarged search space. In this paper, we therefore propose algorithms for mining spatiotemporal patterns in a mobile environment. Moving patterns are generated by two algorithms called All_MOP and Max_MOP. The first mines all frequent patterns, and the second discovers only maximal frequent patterns. The proposed approach reduces execution time compared with the DFS_MINE algorithm. In addition, it is applicable to location-based services such as tourist and traffic services.

Extracting optimal moving patterns of edge devices for efficient resource placement in an FEC environment (FEC 환경에서 효율적 자원 배치를 위한 엣지 디바이스의 최적 이동패턴 추출)

  • Lee, YonSik;Nam, KwangWoo;Jang, MinSeok
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.1 / pp.162-169 / 2022
  • In a dynamically changing, time-varying network environment, the optimal moving pattern of edge devices can be used to distribute computing resources to edge cloud servers or to deploy new edge servers in the FEC (Fog/Edge Computing) environment. It can also be used to build an environment capable of efficient computation offloading, alleviating the latency problems that are a disadvantage of cloud computing. This paper proposes an algorithm that extracts the optimal moving pattern by analyzing, on the basis of frequency, the moving paths of multiple edge devices that require application services in an arbitrary spatio-temporal environment. A comparative experiment with the A* and Dijkstra algorithms shows that the proposed algorithm runs relatively fast, uses less memory, and extracts a more accurate optimal path. Furthermore, the comparison with the A* algorithm suggests that applying weights (preference, congestion, etc.) together with frequency can increase path extraction accuracy.

A Method of Frequent Structure Detection Based on Active Sliding Window (능동적 슬라이딩 윈도우 기반 빈발구조 탐색 기법)

  • Hwang, Jeong-Hee
    • Journal of Digital Contents Society / v.13 no.1 / pp.21-29 / 2012
  • In the ubiquitous computing environment, large-scale data exchange over sensor networks is rising along with the rapid growth of the Internet, so continuous stream data must be processed. Accordingly, there has been mining research on extracting frequent structures from XML stream data and on processing queries over it efficiently. In this paper, we propose a mining method that extracts frequent structures of XML stream data in the most recent window, based on active window sliding driven by trigger rules. The proposed method is basic research on using trigger rules to control the stream data flow for data mining and continuous queries.

Spatial-Temporal Moving Sequence Pattern Mining (시공간 이동 시퀀스 패턴 마이닝 기법)

  • Han, Seon-Young;Yong, Hwan-Seung
    • The Korean Journal of Applied Statistics / v.19 no.3 / pp.599-617 / 2006
  • Recently, many LBS (Location Based Service) systems have emerged in mobile computing. Spatial-Temporal Moving Sequence Pattern Mining is a new mining method that mines user moving patterns from user moving path histories in a sensor network environment. Conventional frequent pattern mining deals with the items that customers buy, whereas our method concerns users' moving sequence paths. In this paper, we consider the order of moving paths, so we handle repeated moving paths. We also consider the time a user spends at each location. We propose a new algorithm, Apriori_msp, based on the Apriori algorithm and evaluate its performance.

Clustering XML Documents Considering The Weight of Large Items in Clusters (클러스터의 주요항목 가중치 기반 XML 문서 클러스터링)

  • Hwang, Jeong-Hee
    • The KIPS Transactions:PartD / v.14D no.1 s.111 / pp.1-8 / 2007
  • As the number of web documents written in XML, a data exchange language for the advanced Internet, increases, such documents have become a target of information retrieval. Accordingly, there has been research on the structure, integration, and retrieval of XML documents. This paper proposes a method for clustering XML documents based on frequent structures, as basic research toward efficient query processing and retrieval. First, the trees representing XML documents are decomposed and frequent structures are extracted from them. Second, treating frequent structures as the items of transactions, we perform clustering that considers the weight of large items in order to adjust cluster creation and cluster cohesion. Third, we demonstrate the merit of our method through experiments that compare it with previous methods.

Discovery of Frequent Traversal Patterns from Weighted Traversals and Performance Enhancement by Traversal Split (가중치 순회로부터 빈발 순회패턴의 탐사 및 순회분할을 통한 성능향상)

  • Lee, Seong-Dae;Park, Hyu-Chan
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.5 / pp.940-948 / 2007
  • Many real-world problems can be modeled as a graph and traversals on the graph. For example, the structure of Web pages can be represented as a graph, and a user's navigation path through the Web pages can be modeled as a traversal on that graph. It is of interest to discover valuable patterns, such as frequent patterns, from such traversals. In this paper, we propose an algorithm to discover frequent traversal patterns when a directed graph and weighted traversals on the graph are given. Furthermore, we propose a performance enhancement based on traversal splitting and verify it through experiments.

A Comparative Study on Feature Selection and Classification Methods Using Closed Frequent Patterns Mining (닫힌 빈발 패턴을 기반으로 한 특징 선택과 분류방법 비교)

  • Zhang, Lei;Jin, Cheng Hao;Ryu, Keun Ho
    • Proceedings of the Korea Information Processing Society Conference / 2010.11a / pp.148-151 / 2010
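  • Classification is among the best-known data mining techniques and includes methods such as decision trees, SVM (Support Vector Machine), and ANN (Artificial Neural Network). Classification is an analysis method that builds a model of the characteristics of each group from multivariate observations belonging to several known, mutually exclusive groups, and then decides which group a new observation of unknown membership should be assigned to. Classification basically assumes that the feature space is well represented. In real applications, however, a feature space composed of single features is not clear-cut, so classification performs poorly. As a solution to this problem, this paper studies classification based on closed frequent patterns, which carry much information without losing information about frequent patterns. In the experiments, meaningful features were selected using the ${\chi}^2$ (chi-square) and information gain attribute selection measures. As a result, feature selection with the measures presented in this study yielded better classification performance than classifiers such as C4.5 and SVM.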
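
The two attribute selection measures named in the abstract can be computed as in the following Python sketch. It uses the standard textbook definitions of information gain and the chi-square statistic and treats each pattern as a discrete feature (for example, 1 if a closed frequent pattern occurs in a record, 0 otherwise). The function names and the toy data are assumptions, not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    gain = entropy(labels)
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        gain -= len(subset) / n * entropy(subset)
    return gain


def chi_square(feature, labels):
    """Chi-square statistic of the feature/label contingency table."""
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    joint = Counter(zip(feature, labels))
    stat = 0.0
    for f, fc in feature_counts.items():
        for l, lc in label_counts.items():
            expected = fc * lc / n  # count expected under independence
            stat += (joint[(f, l)] - expected) ** 2 / expected
    return stat


labels = ["a", "a", "b", "b"]
informative = [1, 1, 0, 0]    # pattern presence matches the class exactly
uninformative = [1, 0, 1, 0]  # pattern presence is independent of the class
print(information_gain(informative, labels), chi_square(informative, labels))
print(information_gain(uninformative, labels), chi_square(uninformative, labels))
```

Both measures score the informative pattern high and the uninformative one at zero, so ranking patterns by either measure and keeping the top ones gives a simple feature selection step.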

Discovering Association Rules using Item Clustering on Frequent Pattern Network (빈발 패턴 네트워크에서 아이템 클러스터링을 통한 연관규칙 발견)

  • Oh, Kyeong-Jin;Jung, Jin-Guk;Ha, In-Ay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.14 no.1 / pp.1-17 / 2008
  • Data mining is defined as the process of discovering meaningful and useful patterns in large volumes of data. In particular, finding association rules between items in a database of customer transactions has become an important task. Since the Apriori algorithm, several data structures and algorithms have been proposed that store meaningful information compressed from the original database in order to find frequent itemsets. Although existing methods find all association rules, analyzing them takes a great deal of work because there are too many rules. In this paper, we propose a new data structure, called a Frequent Pattern Network (FPN), which represents items as vertices and 2-itemsets as edges of the network. We construct the FPN using item frequencies. We then use a clustering method to group the vertices of the network into clusters so that intracluster similarity is maximized and intercluster similarity is minimized, and we generate association rules based on the clusters. Our experiments measured the accuracy of clustering items on the network using confidence, correlation, and edge weight similarity methods. We also generated association rules using the clusters and compared our method with the traditional one. The results show that confidence similarity had a stronger influence than the other measures on the frequent pattern network, and that the FPN was flexible with respect to the minimum support value.
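
The network described in this abstract, with items as vertices and 2-itemsets as edges, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: vertex and edge weights are taken to be raw occurrence counts, confidence is the usual count(a, b) / count(a), and the clustering step is omitted. The names `build_fpn` and `confidence` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_fpn(transactions, min_support=1):
    """Build vertex (item) and edge (2-itemset) counts from transactions."""
    vertex = Counter()
    edge = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        vertex.update(items)
        edge.update(combinations(items, 2))  # pairs are stored in sorted order
    # Keep only vertices and edges that meet the minimum support count.
    vertex = {i: c for i, c in vertex.items() if c >= min_support}
    edge = {p: c for p, c in edge.items() if c >= min_support}
    return vertex, edge


def confidence(vertex, edge, a, b):
    """Confidence of the rule a -> b read off the network."""
    pair = tuple(sorted((a, b)))
    return edge.get(pair, 0) / vertex[a]


transactions = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
    ["bread"],
]
vertex, edge = build_fpn(transactions, min_support=2)
print(edge)
print(confidence(vertex, edge, "eggs", "milk"))
```

Since all counts are stored once, raising the minimum support later only requires filtering the existing vertices and edges, which is one plausible reading of the flexibility the abstract mentions.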
