• Title/Summary/Keyword: Apriori algorithms

An Algorithm for reducing the search time of Frequent Items (빈발 항목의 탐색 시간을 단축하기 위한 알고리즘)

  • Yun, So-Young;Youn, Sung-Dae
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.1 / pp.147-156 / 2011
  • As information systems have become more widely used, methods for quickly identifying relevant products from large volumes of data have been studied. Association rule mining, which finds hidden patterns, has drawn much attention, and the Apriori algorithm is a major method. However, Apriori's repeated database scans increase search time. This paper proposes an algorithm to reduce the search time for frequent items. The proposed algorithm builds a matrix from the transaction database and searches for frequent items using the mean number of items per transaction in the matrix and a defined minimum support. The mean number of items per transaction is used to reduce the number of transactions, and the minimum support is used to reduce the number of items. The performance of the proposed algorithm is assessed by comparing its search time and precision with those of existing algorithms. The results indicate that the proposed algorithm extracts the final frequent items more quickly and efficiently than the existing Apriori and Matrix algorithms.
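
For reference, the repeated-scan cost mentioned in the abstract above can be seen in a minimal sketch of the classic level-wise Apriori procedure. This is the baseline only, not the matrix-based algorithm the paper proposes; the function name `apriori` and the use of an absolute support count are illustrative assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: combine frequent (k-1)-itemsets into size-k candidates.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # One full scan of the database per level: the repeated-scan cost.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1

    return result
```

Each pass of the `while` loop rescans every transaction, which is the overhead that matrix-based and transaction-reduction variants aim to cut.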

Association Rule Mining Considering Strategic Importance (전략적 중요도를 고려한 연관규칙 탐사)

  • Choi, Doug-Won;Shin, Jin-Gyu
    • Proceedings of the Korea Information Processing Society Conference / 2007.05a / pp.443-446 / 2007
  • This paper presents a new association rule mining algorithm that reflects the strategic importance of associative relationships between items. The algorithm builds on the basic framework of the Apriori procedure and on the TSAA (transitive support association Apriori) procedure developed by Hyun and Choi for evaluating non-frequent itemsets. It considers the strategic importance (weight) of feature variables in the association rule mining process. Sample feature variables of strategic importance include profitability, marketing value, customer satisfaction, and frequency. A data set of 730 transactions from a large-scale discount store was used to compare and verify the performance of the presented algorithm against the existing Apriori and TSAA algorithms. The results clearly indicated that the new algorithm produced substantially different association itemsets depending on the weights assigned to the strategic feature variables.

Association Rule Discovery Considering Strategic Importance: WARM (전략적 중요도를 고려한 연관규칙의 발견: WARM)

  • Choi, Doug-Won
    • The KIPS Transactions:PartD / v.17D no.4 / pp.311-316 / 2010
  • This paper presents a weight adjusted association rule mining algorithm (WARM). The key ideas of the algorithm are assigning a weight to each strategic factor and normalizing raw scores within each strategic factor. WARM extends the earlier TSAA (transitive support association Apriori) algorithm and reflects strategic importance by considering factors such as the profit, marketing value, and customer satisfaction of each item. A performance analysis on a real-world database is reported, together with a comparison of the mining outcomes of three association rule mining algorithms (Apriori, TSAA, and WARM). The results indicate that each algorithm exhibits distinct and characteristic behavior in association rule mining.
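
The two ideas named in the abstract above (per-factor normalization and per-factor weights) can be illustrated with a rough sketch. The abstract does not give WARM's formulas, so everything here is an assumption for illustration: min-max normalization, a weighted sum per item, and a hypothetical `weighted_support` that scales plain support by the mean item weight.

```python
def min_max_normalize(values):
    """Scale one factor's raw scores to [0, 1] (assumed normalization)."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {k: 0.0 for k in values}
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}


def item_weights(raw_scores, factor_weights):
    """Combine normalized factor scores into one weight per item.

    raw_scores:     {factor: {item: raw score}}
    factor_weights: {factor: weight}
    """
    normalized = {f: min_max_normalize(s) for f, s in raw_scores.items()}
    items = set().union(*(s.keys() for s in raw_scores.values()))
    return {
        i: sum(factor_weights[f] * normalized[f].get(i, 0.0)
               for f in raw_scores)
        for i in items
    }


def weighted_support(itemset, transactions, weights):
    """Hypothetical measure: plain support times the mean item weight."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= frozenset(t))
    support = hits / len(transactions)
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    return support * mean_weight
```

Normalizing within each factor keeps a factor measured in large units (for example profit) from dominating one measured on a small scale (for example a satisfaction rating) before the factor weights are applied.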

Development of Recommendation Agents through Web Log Analysis (웹 로그 분석을 이용한 추천 에이전트의 개발)

  • 김성학;이창훈
    • Journal of the Korea Computer Industry Society / v.4 no.10 / pp.621-630 / 2003
  • Web logs are the information recorded by a web server when users access a website, and with the rapid rise in internet usage, their practical value has become increasingly important. Analyzing such logs can reveal patterns in users' navigational behavior on a website and help restructure the site to create a more effective organizational presence. For these applications, the key methods used in many studies are association rules and sequential patterns based on Apriori algorithms, which are widely used to extract correlations among patterns. However, Apriori is inherently inefficient in computing cost when applied to large databases. In this paper, we develop a new algorithm for mining interesting patterns that is faster than the Apriori algorithm, along with recommendation agents that can provide a system manager with valuable information that is accessed sequentially by many users.

Frequently Occurred Information Extraction from a Collection of Labeled Trees (라벨 트리 데이터의 빈번하게 발생하는 정보 추출)

  • Paik, Ju-Ryon;Nam, Jung-Hyun;Ahn, Sung-Joon;Kim, Ung-Mo
    • Journal of Internet Computing and Services / v.10 no.5 / pp.65-78 / 2009
  • The most commonly adopted approach to finding valuable information in tree data is to extract frequently occurring subtree patterns. Because mining frequent tree patterns has a wide range of applications, such as XML mining, web usage mining, bioinformatics, and network multicast routing, many algorithms have recently been proposed to find such patterns. However, existing tree mining algorithms suffer from several serious pitfalls when finding frequent tree patterns in massive tree datasets. Some of the major problems are due to (1) modeling data as a hierarchical tree structure, (2) the high computational cost of candidate maintenance, (3) repeated scans of the input dataset, and (4) high memory dependency. These problems arise because most of these algorithms are based on the well-known Apriori algorithm and use the anti-monotone property for candidate generation and frequency counting. To solve these problems, we adopt a pattern-growth approach rather than the Apriori approach, and we extract maximal frequent subtree patterns instead of all frequent subtree patterns. The proposed method not only removes the step of pruning infrequent subtrees but also eliminates the generation of candidate subtrees entirely. Hence, it significantly improves the whole mining process.

A Store Recommendation Procedure in Ubiquitous Market (U-마켓에서의 매장 추천방법)

  • Kim, Jae-Kyeong;Chae, Kyung-Hee;Kim, Min-Yong
    • Journal of Intelligence and Information Systems / v.13 no.4 / pp.45-63 / 2007
  • As ubiquitous computing environments come to the fore, information density is rising and enterprises are able to capture and use customer-related information at the moment a customer purchases a product. In this environment, the need for recommender systems that can deliver the right information to the customer at the right time and in the right situation has increased greatly, and research on recommender systems has been active in a variety of fields. Until now, most recommender systems have dealt with item recommendation. However, in a ubiquitous market where the same item can be purchased at several stores, it is highly desirable to recommend a store to the customer based on his or her contextual situation and preferences, such as store location, store atmosphere, and product quality and price. In this line of research, we propose a store recommender system that uses the customer's contextual situation and preferences in a ubiquitous market. The system is based on collaborative filtering and Apriori algorithms. It can provide customer-centric service, enhance the shopping experience, and contribute to revitalizing the market in the long term.

A New Association Rule Mining based on Coverage and Exclusion for Network Intrusion Detection (네트워크 침입 탐지를 위한 Coverage와 Exclusion 기반의 새로운 연관 규칙 마이닝)

  • Tae Yeon Kim;KyungHyun Han;Seong Oun Hwang
    • Journal of Internet of Things and Convergence / v.9 no.1 / pp.77-87 / 2023
  • Applying association rule mining algorithms to network intrusion detection involves two critical issues: the generated rule set is too large to be used in IoT systems, and the false negative and false positive rates are hard to control. In this research, we propose an association rule mining algorithm based on two newly defined measures called coverage and exclusion. Coverage shows how frequently a pattern is found among the transactions of a class, and exclusion shows how frequently a pattern is not found in the transactions of the other classes. We compare our algorithm experimentally with Apriori, the best-known association rule mining algorithm, using the public KDDcup99 dataset. Compared to Apriori, the proposed algorithm reduces the size of the resulting rule set by up to 93.2 percent while fully preserving accuracy. The proposed algorithm also allows the false negative and false positive rates of the generated rules to be controlled through parameters. Network analysts can therefore apply the proposed association rule mining effectively to network intrusion detection, with both issues addressed.
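
The two measures can be sketched directly from the one-sentence descriptions in the abstract above. The paper's exact definitions may differ; the function signatures, the handling of empty classes, and the sample data in the usage note are assumptions.

```python
def coverage(pattern, transactions, labels, target):
    """Fraction of the target class's transactions that contain the pattern."""
    pattern = frozenset(pattern)
    in_class = [frozenset(t) for t, y in zip(transactions, labels)
                if y == target]
    if not in_class:
        return 0.0  # assumed convention for an empty class
    return sum(1 for t in in_class if pattern <= t) / len(in_class)


def exclusion(pattern, transactions, labels, target):
    """Fraction of the other classes' transactions that lack the pattern."""
    pattern = frozenset(pattern)
    others = [frozenset(t) for t, y in zip(transactions, labels)
              if y != target]
    if not others:
        return 1.0  # assumed convention when there are no other classes
    return sum(1 for t in others if not pattern <= t) / len(others)
```

A pattern with high coverage and high exclusion for a class appears in most of that class's transactions and in few of the others, so thresholds on the two values could act as the parameters that trade off false negatives against false positives.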

Apriori Based Big Data Processing System for Improve Sensor Data Throughput in IoT Environments (IoT 환경에서 센서 데이터 처리율 향상을 위한 Apriori 기반 빅데이터 처리 시스템)

  • Song, Jin Su;Kim, Soo Jin;Shin, Young Tae
    • KIPS Transactions on Computer and Communication Systems / v.10 no.10 / pp.277-284 / 2021
  • The smart home environment is expected to become a platform that collects, integrates, and uses various data through convergence with wireless information and communication technology. The number of smart devices with various sensors inside smart homes is increasing, and so is the amount of data these devices generate, so big data processing systems are being actively introduced to handle it effectively. However, traditional big data processing systems direct all requests to the cluster driver before allocating them to distributed nodes, which reduces cluster-wide performance because the cluster driver managing the segmentation tasks becomes a bottleneck. The delay is particularly large for smart home devices that constantly request small data processing jobs. In this paper, we therefore design an Apriori-based big data system for effective data processing in smart home environments where frequent requests occur at the same time. According to the performance evaluation, the proposed system reduced data processing time by at least 19.2% and by up to 38.6% compared with the existing system. This result is related to the type of data being measured: because the amount of data collected in a smart home environment is large, cache servers play a major role in data processing, and association analysis with the Apriori algorithm stores highly relevant sensor data in the cache.

An analysis of students' online class preference depending on the gender and levels of school using Apriori Algorithm (Apriori 알고리즘을 활용한 학습자의 성별과 학교급에 따른 온라인 수업 유형 선호도 분석)

  • Kim, Jinhee;Hwang, Doohee;Lee, Sang-Soog
    • Journal of Digital Convergence / v.20 no.1 / pp.33-39 / 2022
  • This study investigates online class preferences by students' gender and school level. To this end, the study surveyed 4,803 elementary, middle, and high school students in 17 regions nationwide. The 4,524 valid responses were then analyzed using the Apriori algorithm to identify patterns of online class preference associated with gender and school level. As a result, a total of 16 rules were derived: 7 for elementary school students, 4 for middle school students, and 5 for high school students. Specifically, male elementary school students preferred software-based classes, whereas female elementary school students preferred maker-based classes. In middle school, both male and female students preferred virtual experience-based classes. High school students, on the other hand, had a higher preference for subject-specific lecture-based classes. The findings can serve as empirical evidence of the needs for online classes perceived by K-12 students. In addition, this study can be used as basic research for suggesting areas of improvement in diversifying online classes. Future studies can conduct in-depth analysis of the development of various online class activities and models, the design of online class platforms, and female students' career motivation in the field of science and technology.
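
Rules of the kind reported in the abstract above are usually ranked by support, confidence, and lift. The sketch below shows how those three measures are computed for a single rule; the `attribute=value` encoding and the records in the usage note are hypothetical and are not data from the study.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence, lift) for antecedent -> consequent.

    Each record is a collection of "attribute=value" strings
    (an encoding assumed for this sketch).
    """
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    rows = [frozenset(r) for r in records]
    n = len(rows)

    n_a = sum(1 for r in rows if antecedent <= r)
    n_c = sum(1 for r in rows if consequent <= r)
    n_both = sum(1 for r in rows if (antecedent | consequent) <= r)

    support = n_both / n
    confidence = n_both / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift
```

For example, with four made-up records in which both male elementary students chose software-based classes, `rule_metrics(records, {"school=elementary", "gender=male"}, {"pref=software"})` measures how often the rule holds (confidence) and how much more often than chance (lift).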

An analysis of operation status depending on the characteristics of R&D projects in Sciences and Engineering universities (이공계 대학 연구과제 특성 별 운영 형태 현황)

  • Lee, Sang-Soog;Yoo, Inhyeok;Kim, Jinhee
    • Journal of Digital Convergence / v.20 no.4 / pp.93-100 / 2022
  • This study aimed to understand the current status of R&D operations at science and engineering universities (SEUs) according to research project characteristics (e.g., stages and characteristics), and to provide implications for future university R&D support systems and related policies. An online survey of SEU R&D recipients was conducted between October 4 and November 5, 2021. The 445 valid responses were analyzed using the Apriori algorithm, yielding 16 association rules for R&D operation by research project characteristics. The rules show that, regardless of research characteristics, SEU R&D projects, particularly in applied research, were funded or operated under the leadership of government or public institutions. For basic research, individual researchers had a higher level of autonomy in determining research topics; yet these projects had a short duration (3 years) and a unit of evaluation period of more than 3 years. These findings can serve as empirical evidence of the relationships among the variables involved in operating SEU R&D.