• Title/Summary/Keyword: Incremental mining

Distributed Incremental Approximate Frequent Itemset Mining Using MapReduce

  • Mohsin Shaikh;Irfan Ali Tunio;Syed Muhammad Shehram Shah;Fareesa Khan Sohu;Abdul Aziz;Ahmad Ali
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.5
    • /
    • pp.207-211
    • /
    • 2023
  • Traditional data mining methods typically assume that the data is small, centralized, memory-resident, and static. This assumption no longer holds, because datasets are growing rapidly and becoming very large. There is a fast-growing need to manage such data with efficient mining algorithms. In this scenario, data mining must be carried out in a distributed environment, and Frequent Itemset Mining (FIM) is no exception. Thus the need for an efficient incremental mining algorithm arises. We propose Distributed Incremental Approximate Frequent Itemset Mining (DIAFIM), an incremental FIM algorithm that runs in a distributed, parallel MapReduce environment. The key contribution of this research is an incremental mining algorithm designed for that environment.
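
The idea of incremental frequent itemset mining in a map/reduce style can be illustrated with a minimal single-machine sketch. This is not the DIAFIM algorithm: the abstract does not describe its approximation scheme or its MapReduce job layout, so the names `map_batch` and `IncrementalCounter`, the `max_size` cap, and the exact (non-approximate) counting are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def map_batch(transactions, max_size=2):
    """Map step (hypothetical): count every itemset of up to max_size items in one batch."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


class IncrementalCounter:
    """Keeps running itemset counts so earlier batches are never rescanned."""

    def __init__(self, min_support):
        self.min_support = min_support  # relative threshold in (0, 1]
        self.counts = Counter()
        self.n_transactions = 0

    def update(self, batch, max_size=2):
        """Reduce step: merge the new batch's counts into the running totals."""
        self.counts.update(map_batch(batch, max_size))
        self.n_transactions += len(batch)

    def frequent(self):
        """Return itemsets whose count meets the threshold over all data seen so far."""
        threshold = self.min_support * self.n_transactions
        return {s: c for s, c in self.counts.items() if c >= threshold}


miner = IncrementalCounter(min_support=0.6)
miner.update([["a", "b"], ["a", "c"]])
first = miner.frequent()
miner.update([["a", "b"], ["b", "c"]])  # only the new batch is scanned
second = miner.frequent()
```

Because only per-batch counts are merged, each `map_batch` call could in principle run on a different worker; keeping counts for every itemset is what an approximate method would try to avoid.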

IMTAR: Incremental Mining of General Temporal Association Rules

  • Dafa-Alla, Anour F.A.;Shon, Ho-Sun;Saeed, Khalid E.K.;Piao, Minghao;Yun, Un-Il;Cheoi, Kyung-Joo;Ryu, Keun-Ho
    • Journal of Information Processing Systems
    • /
    • v.6 no.2
    • /
    • pp.163-176
    • /
    • 2010
  • Nowadays, due to rapid advances in the field of information systems, transactional databases are updated regularly and/or periodically. The knowledge discovered from these databases has to be maintained, and an incremental updating technique is needed to maintain the association rules discovered from them. The concept of Temporal Association Rules has been introduced to handle time series by including time expressions in association rules. In this paper we introduce a novel algorithm for Incremental Mining of General Temporal Association Rules (IMTAR) using an extended TFP-tree. The main benefits of our algorithm are that it offers significant advantages in storage and running time, and that it can mine general temporal association rules in incremental databases by building TFP-trees incrementally. It can be applied to real-life application domains. We demonstrate our algorithm and its advantages in this paper.

An Online Response System for Anomaly Traffic by Incremental Mining with Genetic Optimization

  • Su, Ming-Yang;Yeh, Sheng-Cheng
    • Journal of Communications and Networks
    • /
    • v.12 no.4
    • /
    • pp.375-381
    • /
    • 2010
  • A flooding attack, such as a DoS or worm attack, can be easily created or even downloaded from the Internet, making it one of the main threats to Internet servers. This paper presents an online real-time network response system that can determine within a very short time unit whether a LAN is suffering from a flooding attack. The detection engine of the system is based on the incremental mining of fuzzy association rules from network packets, in which the membership functions of the fuzzy variables are optimized by a genetic algorithm. The incremental mining approach makes the system suitable for detecting, and thus responding to, an attack in real time. The system was evaluated against 47 flooding attacks; only one was missed, and no false positives occurred. The proposed online system performs anomaly detection, not misuse detection. Moreover, a mechanism for dynamic firewall updating is embedded in the system to eliminate suspicious connections when necessary.

RFM based Incremental Frequent Patterns mining Method for Recommendation in e-Commerce (전자상거래 추천을 위한 RFM기반의 점진적 빈발 패턴 마이닝 기법)

  • Cho, Young Sung;Moon, Song Chul;Ryu, Keun Ho
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2012.07a
    • /
    • pp.135-137
    • /
    • 2012
  • An existing recommendation system using association rules is inefficient in an incremental data environment, where new data are added continuously, because it reprocesses data that have already been processed. We propose a recommendation technique for e-commerce using RFM-based incremental frequent pattern mining. The proposed technique can rapidly extract frequent items and create association rules using frequent pattern mining as new data are continuously added.

Recent Technique Analysis, Infant Commodity Pattern Analysis Scenario and Performance Analysis of Incremental Weighted Maximal Representative Pattern Mining (점진적 가중화 맥시멀 대표 패턴 마이닝의 최신 기법 분석, 유아들의 물품 패턴 분석 시나리오 및 성능 분석)

  • Yun, Unil;Yun, Eunmi
    • Journal of Internet Computing and Services
    • /
    • v.21 no.2
    • /
    • pp.39-48
    • /
    • 2020
  • Data mining techniques have been proposed to find meaningful and useful information efficiently. In big data environments in particular, as data accumulates across many applications, related pattern mining methods have been proposed. Recently, rather than analyzing only static data already stored in files or databases, mining dynamic data generated incrementally in real time has become a more active research area, because such data can be read only once. For this reason, how to mine dynamic data efficiently has been studied. Moreover, approaches that mine representative patterns, such as maximal pattern mining, have been proposed because mining otherwise produces a huge number of result patterns. In addition, to discover more meaningful patterns in the real world, weighted pattern mining assigns weights to items; in practice, the profits, costs, and similar attributes of items can serve as weights. In this paper, we analyze weighted maximal pattern mining approaches for incrementally generated data, maximal representative pattern mining techniques, and incremental pattern mining methods. We then present application scenarios for analyzing patterns of commodities required by infants by applying weighted representative pattern mining. Furthermore, the performance of state-of-the-art algorithms is evaluated. The results show that the incremental weighted maximal pattern mining technique performs better than incremental weighted pattern mining and weighted maximal pattern mining.
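
To make the two notions in this abstract concrete, the following brute-force sketch computes weighted frequent patterns and then keeps only the maximal ones. It assumes one common definition of weighted support (average item weight times support), which the abstract does not confirm; the function names, item names, weights, and threshold are hypothetical, and no incremental update is shown.

```python
from itertools import combinations


def weighted_support(itemset, transactions, weights):
    """Assumed definition: average item weight multiplied by relative support."""
    avg_weight = sum(weights[i] for i in itemset) / len(itemset)
    items = set(itemset)
    support = sum(1 for t in transactions if items <= set(t)) / len(transactions)
    return avg_weight * support


def weighted_frequent(transactions, weights, min_wsup):
    """Enumerate all itemsets exhaustively. With average weights, weighted support
    is not anti-monotone, so this sketch does no Apriori-style pruning."""
    items = sorted({i for t in transactions for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            ws = weighted_support(candidate, transactions, weights)
            if ws >= min_wsup:
                result[frozenset(candidate)] = ws
    return result


def maximal(patterns):
    """Keep only patterns that have no proper superset among the given patterns."""
    return {p for p in patterns if not any(p < q for q in patterns)}


# Hypothetical infant-commodity baskets and item weights.
baskets = [
    ["diaper", "wipes"],
    ["diaper", "wipes", "bottle"],
    ["diaper", "bottle"],
    ["wipes"],
]
item_weights = {"diaper": 0.9, "wipes": 0.6, "bottle": 0.3}

frequent = weighted_frequent(baskets, item_weights, min_wsup=0.35)
representative = maximal(frequent)
```

Here three patterns pass the weighted threshold, but only one maximal pattern is reported, which is the reduction in result size that representative pattern mining aims for.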

TFP tree-based Incremental Emerging Patterns Mining for Analysis of Safe and Non-safe Power Load Lines (Safe와 Non-safe 전력 부하 라인 분석을 위한 TFP트리 기반의 점진적 출현패턴 마이닝)

  • Lee, Jong-Bum;Piao, Ming Hao;Ryu, Keun-Ho
    • Spatial Information Research
    • /
    • v.19 no.2
    • /
    • pp.71-76
    • /
    • 2011
  • In this paper, to use emerging patterns to define and analyze the significant differences between safe and non-safe power load lines and to identify which lines are potentially non-safe, we propose an incremental TFP-tree algorithm for mining emerging patterns that searches efficiently within limited memory. In particular, pruning pre-infrequent patterns and using two different minimum supports make it possible for the algorithm to mine most emerging patterns and to handle incrementally growing, large data sets such as power consumption data.
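
Emerging patterns are commonly defined through a growth rate: the ratio of a pattern's support in a target class to its support in a background class. The sketch below computes that ratio on made-up data. It assumes this standard definition applies to the paper, does not implement the TFP-tree or the two-threshold pruning, and uses hypothetical feature names.

```python
import math


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    if not transactions:
        return 0.0
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def growth_rate(itemset, background, target):
    """Support in the target class divided by support in the background class."""
    s_bg = support(itemset, background)
    s_tg = support(itemset, target)
    if s_tg == 0:
        return 0.0
    if s_bg == 0:
        return math.inf  # pattern appears only in the target class
    return s_tg / s_bg


# Hypothetical discretized load-line records for each class.
safe = [["low", "day"], ["low", "night"], ["high", "day"], ["low", "day"]]
non_safe = [["high", "night"], ["high", "day"], ["high", "night"], ["low", "night"]]

rate_high = growth_rate(["high"], safe, non_safe)
rate_high_night = growth_rate(["high", "night"], safe, non_safe)
```

A pattern would be reported as emerging when its growth rate meets a chosen threshold; a pattern that never occurs in the background class has an infinite growth rate.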

Framework for False Alarm Pattern Analysis of Intrusion Detection System using Incremental Association Rule Mining

  • Chon Won Yang;Kim Eun Hee;Shin Moon Sun;Ryu Keun Ho
    • Proceedings of the KSRS Conference
    • /
    • 2004.10a
    • /
    • pp.716-718
    • /
    • 2004
  • False alarms in intrusion detection systems are divided into false positives and false negatives. False positives degrade the performance of an intrusion detection system, and false negatives degrade its efficiency. Recently, most work has studied data mining techniques for analyzing alert data. However, false alarm data not only increase data volume but also change the patterns of alert data over time. Therefore, we need a tool that can analyze patterns whose characteristics change when we look for new patterns. In this paper, we focus on false positives and present a framework for analyzing false alarm patterns in alert data. We also apply incremental data mining techniques to analyze patterns of false alarms in alert data that grows over time. Finally, we achieve flexibility by using a dynamic support threshold, because the volume of alert data, and of the false alarms it contains, increases irregularly.

An Efficient Candidate Pattern Storage Tree Structure and Algorithm for Incremental Web Mining (점진적인 웹 마이닝을 위한 효율적인 후보패턴 저장 트리구조 및 알고리즘)

  • Kang, Hee-Seong;Park, Byung-Jun
    • Proceedings of the KIEE Conference
    • /
    • 2006.04a
    • /
    • pp.3-5
    • /
    • 2006
  • Recent advances in Internet infrastructure have resulted in a large number of huge Web sites and portals worldwide. These Web sites are visited by various types of users in many different ways. Among all the Web page access sequences from different users, some occur so frequently that they may deserve attention from interested parties. We call these frequent access patterns, and we call access sequences that may become frequent candidate patterns. Since candidate patterns play an important role in incremental Web mining, it is important to generate, add, delete, and search for them efficiently. This thesis presents a novel tree structure that can efficiently store candidate patterns, together with a set of algorithms for generating the tree structure, adding new patterns, deleting unnecessary patterns, and searching for needed ones. The proposed tree structure has a kind of three-dimensional link structure, and its nodes are layered.

A Design of false alarm analysis framework of intrusion detection system by using incremental mining method (점진적 마이닝 기법을 적용한 침입탐지 시스템의 오 경보 분석 프레임워크 설계)

  • Kim Eun-Hee;Ryu Keun-Ho
    • The KIPS Transactions:PartC
    • /
    • v.13C no.3 s.106
    • /
    • pp.295-302
    • /
    • 2006
  • An intrusion detection system generates many alarms in response to attack behaviors in real time. These alarms include not only actual attack alarms but also false alarms, which are mistakes made by the intrusion detection system. False alarms are the main factor reducing the efficiency of an intrusion detection system, and in this paper we propose a framework for false alarm analysis. We also apply an incremental data mining method to analyze the patterns of continuously increasing false alarms. The framework consists of a GUI, DB Manager, Alert Preprocessor, and False Alarm Analyzer. Through experiments with the proposed framework, we analyze false alarms incrementally and show that false alarms are reduced by applying the resulting false alarm rules in the intrusion detection system.

Design and Implementation of Incremental Learning Technology for Big Data Mining

  • Min, Byung-Won;Oh, Yong-Sun
    • International Journal of Contents
    • /
    • v.15 no.3
    • /
    • pp.32-38
    • /
    • 2019
  • Traditional mining techniques often struggle to process or manage Big Data generated from various digital media and/or sensors. In addition, when new data accumulate continuously in large volumes of text, problems such as insufficient memory and the burden of the learning curve arise, because the full data set, including data already collected and analyzed, is inefficiently re-analyzed. In this paper, we propose a general-purpose classifier and its structure to solve these problems. We depart from current feature-reduction methods and introduce a new scheme that adopts only the changed elements when new features are partially accumulated in this free-style learning environment. The incremental learning module, built through gradually progressive formation, learns only the changed parts of the data without re-processing what has already been accumulated, whereas traditional methods re-learn the entire data set whenever data are added or changed. Additionally, users can freely merge new data with previous data through the resource management procedure whenever re-learning is needed. At the end of this paper, we confirm through an analysis that the method performs well in data processing in a Big Data environment because of its learning efficiency. Also, comparing this algorithm with NB and SVM, we achieve an accuracy of approximately 95% in all three models. We expect that our method will be a viable substitute, in performance and accuracy, for large computing systems in Big Data analysis using a PC cluster environment.