• Title/Abstract/Keywords: association rule algorithm

Search Result 198, Processing Time 0.046 seconds

A Study of Improving on Test Costs in Decision Trees (Decision Tree의 Test Cost 개선에 관한 연구)

  • 석현태
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2002.10c
    • /
    • pp.223-225
    • /
    • 2002
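  • Decision trees help in understanding data because they present a hierarchical view of the target data, but because of the limits of greedy tree construction they cannot be considered optimal predictors. To compensate for this weakness, the paper proposes applying a multidimensional association rule algorithm to a conventionally generated decision tree in order to obtain an optimal subset of short rules, and confirms this experimentally.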

A Method for Frequent Itemsets Mining from Data Stream (데이터 스트림 환경에서 효율적인 빈발 항목 집합 탐사 기법)

  • Seo, Bok-Il;Kim, Jae-In;Hwang, Bu-Hyun
    • The KIPS Transactions:PartD
    • /
    • v.19D no.2
    • /
    • pp.139-146
    • /
    • 2012
  • Data mining is widely used to discover knowledge in many fields. Although there are many methods for discovering association rules, most of them are frequency-based and therefore not well suited to a stream environment: because event data are generated continuously in a stream, storing all of the data is expensive. In this paper, we propose a new method for discovering association rules in a stream environment. The method uses a variable window to extract data items; the window size varies with the gap between occurrences of the same target event. Data are extracted using the COBJ (Count object) calculation method, and FPMDSTN (Frequent Pattern Mining over Data Stream using Terminal Node) discovers association rules from the extracted data items. Experiments show that our method is more efficient in a stream environment than conventional methods.

Knowledge Reasoning Model using Association Rules and Clustering Analysis of Multi-Context (다중상황의 군집분석과 연관규칙을 이용한 지식추론 모델)

  • Shin, Dong-Hoon;Kim, Min-Jeong;Oh, SangYeob;Chung, Kyungyong
    • Journal of the Korea Convergence Society
    • /
    • v.10 no.9
    • /
    • pp.11-16
    • /
    • 2019
  • People in busy modern society are subject to time constraints. As a result, they tend to eat simple junk food and find it difficult to exercise, which is bad for their health, and the incidence of chronic diseases is increasing. In addition, because of information overload, it is increasingly important to make accurate inferences that fit individual characteristics. In this paper, we propose a knowledge reasoning model using association rules and cluster analysis of multi-contexts. The proposed method provides personalized healthcare to users by generating association rules from clusters built on multi-context information. By inferring the risk of each disease, it can help reduce disease incidence. In the performance evaluation, the F-measure of the proposed model is 0.027 higher than that of the comparison model, indicating better performance.

The Goods Recommendation System based on modified FP-Tree Algorithm (변형된 FP-Tree를 기반한 상품 추천 시스템)

  • Kim, Jong-Hee;Jung, Soon-Key
    • Journal of the Korea Society of Computer and Information
    • /
    • v.15 no.11
    • /
    • pp.205-213
    • /
    • 2010
  • This study uses the FP-tree algorithm, a data mining technique, and proposes a new recommendation system based on a modified FP-tree algorithm that derives association rules from frequent 2-itemsets extracted from a transaction database. The modified recommendation system consists of a pre-processing module, a learning module, a recommendation module, and an evaluation module. The study first evaluates the system in terms of precision, recall, F-measure, success rate, and recommendation time, and then compares its efficiency with that of other recommendation systems based on sequential pattern mining. Compared with those systems, the modified system is 5 times more efficient in learning and shows a 20% improvement in recommendation capability. These results indicate that the modified system is more effective than recommendation systems based on sequential pattern mining.
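
The idea of deriving association rules from frequent 2-itemsets can be illustrated with a minimal sketch. This is not the paper's modified FP-tree: it counts item pairs directly, and the names `pair_rules` and `recommend`, the thresholds, and the sample baskets are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) rules
    built from frequent 2-itemsets (illustrative, not the paper's method)."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # de-duplicate and fix pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is not a frequent 2-itemset
        # Each frequent pair can yield a rule in either direction.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


def recommend(rules, owned_item, top_n=3):
    """Recommend consequents of rules whose antecedent is owned_item."""
    matches = [r for r in rules if r[0] == owned_item]
    matches.sort(key=lambda r: r[3], reverse=True)  # highest confidence first
    return [r[1] for r in matches[:top_n]]


# Hypothetical sample data.
transactions = [
    ["milk", "bread", "butter"],
    ["milk", "bread"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["milk", "bread", "eggs"],
]
rules = pair_rules(transactions)
print(recommend(rules, "bread"))
```

Restricting rules to pairs keeps both the learning and the lookup step small, which is one plausible reason a recommender might limit itself to 2-itemsets.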

Design and Implementation of Rule Discovery Algorithm strongly coupled with Time-series databases (시계열 데이터베이스와 강결합된 규칙발견 알고리즘 설계와 구현)

  • 박인창;김성규
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2001.04b
    • /
    • pp.43-45
    • /
    • 2001
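  • Mining systems are implemented in very different ways depending on their characteristics, so interoperability and reusability across mining systems are very low. This paper attempts to address this standardization problem by tightly coupling the mining system with an RDB through a time-series database. Tight coupling with an RDB not only addresses standardization but also maximizes performance by letting the mining system use DBMS technology; in particular, the paper attempts to improve mining performance by using the DBMS index facility. The paper points out shortcomings of existing sequential pattern mining, namely the absence of a notion of time, the reliance on a transaction database structure, and memory limits during algorithm execution. To correct them, it extends the concepts of time distance and pattern length and proposes correspondingly revised formulas for association rules. It also proposes procedures and an algorithm that, being tightly coupled with an RDB, move beyond the existing transaction database structure and can be applied more easily to time-series data.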

A Recommendation System of Exponentially Weighted Collaborative Filtering for Products in Electronic Commerce (지수적 가중치를 적용한 협력적 상품추천시스템)

  • Lee, Gyeong-Hui;Han, Jeong-Hye;Im, Chun-Seong
    • The KIPS Transactions:PartB
    • /
    • v.8B no.6
    • /
    • pp.625-632
    • /
    • 2001
  • Electronic stores have realized that they need to understand their customers and respond quickly to their wants and needs. To succeed in an increasingly competitive Internet marketplace, recommender systems are adopting data mining techniques. One of the most successful recommender technologies is the collaborative filtering (CF) algorithm, which recommends products to a target customer based on information about other customers and employs statistical techniques to find a set of customers known as neighbors. However, such systems are not well suited to seasonal products that are sensitive to time or season, such as refrigerators or seasonal clothes. In this paper, we propose a new adjusted item-based recommendation generation algorithm, called exponentially weighted collaborative filtering recommendation (EWCFR), that computes item-item similarities with regard to seasonal products. Finally, we suggest a recommendation system with relatively high quality and computing speed on a main memory database (MMDB) in XML, since collaborative filtering systems must quickly produce high-quality recommendations for very large-scale problems.
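
The abstract does not give the EWCFR formula, so the following is only a sketch of one common way to make item-item similarity time-sensitive: each rating is scaled by an exponential decay of its age before a cosine similarity is computed. The half-life parameter, the function names, and the sample ratings are assumptions, not details from the paper.

```python
import math
from collections import defaultdict


def decay_weight(age_days, half_life_days=90.0):
    """Exponential weight that halves every half_life_days (assumed parameter)."""
    return math.exp(-math.log(2) * age_days / half_life_days)


def weighted_item_similarity(ratings, item_a, item_b, half_life_days=90.0):
    """Cosine similarity of two items over users who rated both,
    with each rating scaled by its exponential decay weight.

    ratings: list of (user, item, rating, age_days) tuples.
    """
    by_item = defaultdict(dict)
    for user, item, rating, age_days in ratings:
        by_item[item][user] = rating * decay_weight(age_days, half_life_days)

    common = by_item[item_a].keys() & by_item[item_b].keys()
    if not common:
        return 0.0
    dot = sum(by_item[item_a][u] * by_item[item_b][u] for u in common)
    norm_a = math.sqrt(sum(by_item[item_a][u] ** 2 for u in common))
    norm_b = math.sqrt(sum(by_item[item_b][u] ** 2 for u in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical ratings: u2's rating of "fan" is a year old.
ratings = [
    ("u1", "fan", 5, 0), ("u1", "aircon", 5, 0),
    ("u2", "fan", 5, 360), ("u2", "aircon", 1, 0),
]
recent_focus = weighted_item_similarity(ratings, "fan", "aircon", half_life_days=90.0)
no_decay = weighted_item_similarity(ratings, "fan", "aircon", half_life_days=1e9)
print(recent_focus, no_decay)
```

With decay, the stale and disagreeing rating from `u2` contributes less, so the similarity is driven mainly by recent behavior. That is the kind of effect a seasonal-product recommender would want.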

Web document prediction using forward reference path traversal patterns (전 방향 참조 경로 탐사 패턴을 이용한 웹 문서 예측)

  • 김양규;손기락
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2004.10b
    • /
    • pp.112-114
    • /
    • 2004
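  • Web log data, which record the browsing behavior of Web users, have become an important resource for data mining. From these logs one can build a predictive model that predicts a user's next request from the user's current behavior. However, Web logs are very large and hard to analyze. Many methods have already been proposed to address this problem. Among them is a prediction technique that applies association rules on top of sequential classification for effective prediction. This paper proposes a model that improves association-rule-based Web document prediction by applying a forward reference path traversal pattern algorithm.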
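
As a rough illustration of forward reference paths, the sketch below splits one user session into maximal forward references. A forward path is cut off each time the user steps back to a page already on the current path. Whether the paper uses exactly this procedure is an assumption; the function name and the sample session are hypothetical.

```python
def maximal_forward_references(session):
    """Split a page-visit sequence into maximal forward reference paths.

    A backward reference (revisiting a page on the current path) ends the
    current forward path, which is emitted only if the previous move was forward.
    """
    path = []
    moving_forward = False
    results = []
    for page in session:
        if page in path:
            # Backward reference: emit the path once, then back up to that page.
            if moving_forward:
                results.append(list(path))
            path = path[: path.index(page) + 1]
            moving_forward = False
        else:
            # Forward reference: extend the current path.
            path.append(page)
            moving_forward = True
    if moving_forward and path:
        results.append(list(path))
    return results


# Hypothetical session: each letter is a page.
session = list("ABCDCBEGHGWAOUOV")
for ref in maximal_forward_references(session):
    print("".join(ref))
```

The resulting paths can then be treated as transactions for association rule mining, so that backtracking clicks do not add noise to the patterns used for prediction.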

A Design and Implementation of Expert Search Engine Using DataMining (데이타마이닝을 이용한 전문 검색엔진의 설계 및 구현)

  • Hwang, Bo-Youn;Kim, Byung-Chan;Kim, Young-Ji;Mun, Hyeong-Jeong;Woo, Yong-Tae
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2001.04a
    • /
    • pp.43-46
    • /
    • 2001
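  • This paper designs an intelligent specialized search engine using data mining techniques and implements its user interface. First, an association rule mining algorithm is applied to technical terms in the computer field to group semantically related terms into clusters. The clusters built for each technical term are stored in the knowledge base table proposed in the paper and are used when searching for Web documents that contain semantically related terms. During search, the degree of association between the user's keyword and related technical terms is assigned as a weight, and Web documents are output in descending order of association. With the proposed method, Web documents semantically related to the user's keyword could be retrieved effectively.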

Analysis of Internet User Features using Multi-dimensional Association Analysis (다차원 연관 분석을 이용한 인터넷 이용자의 특징 분석)

  • Lee, Su-Eun;Jung, Yong-Gyu
    • Journal of Service Research and Studies
    • /
    • v.1 no.1
    • /
    • pp.61-69
    • /
    • 2011
  • Data mining means finding useful, previously unknown information in large databases that cannot be extracted with a simple query; it can be defined as gaining insight into the data on that basis. In this paper, we look for useful patterns in Web data, or in data stored for a target Web site, in order to analyze the characteristics of Internet users. Association analysis is applied to general statistical data on Internet users to analyze the user characteristics that affect the amount of time spent on the Internet. In the experiments, association rules were extracted from the data, data pre-processing and a Web mining algorithm were applied to obtain optimal results, and the characteristics of Internet users were analyzed.

Network Anomaly Detection based on Association among Packets (패킷간 연관 관계를 이용한 네트워크 비정상행위 탐지)

  • 오상현;이원석
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.12 no.5
    • /
    • pp.63-73
    • /
    • 2002
  • Recently, intrusions into computers have increased rapidly, and various intrusion methods have been developed. As a result, much research has been done on detecting the activities of intruders effectively. In this paper, a new association mining algorithm for anomaly-based network intrusion detection is proposed. The proposed algorithm consists of two phases: intra-packet association and inter-packet association. The performance of the proposed anomaly detection system is evaluated through several experiments over various system parameters in order to identify the practical ranges that maximize its detection rate. The results show that anomalies can be detected effectively.