• Title/Summary/Keyword: Tree mining

A Comparative Study on the Performance of Intrusion Detection using Decision Tree and Artificial Neural Network Models (의사결정트리와 인공 신경망 기법을 이용한 침입탐지 효율성 비교 연구)

  • Jo, Seongrae;Sung, Haengnam;Ahn, Byunghyuk
    • Journal of Korea Society of Digital Industry and Information Management / v.11 no.4 / pp.33-45 / 2015
  • The Internet is now an essential tool in business. Despite this importance, it is exposed to network attacks aimed at fraud, theft of private information, and cyber terrorism. Firewalls and intrusion detection systems (IDS) are tools against those attacks. An IDS determines whether network data represents an attack, analyzing the data with techniques such as expert systems, data mining, and state transition analysis. This paper compares the performance of two data mining models in detecting network attacks: a decision tree (C4.5) and a neural network (FANN model). The models were trained and tested, and their effectiveness was measured in terms of detection accuracy, detection rate, and false alarm rate, to find out which model is more effective for intrusion detection. The analysis used the KDD Cup 99 data, a benchmark data set in intrusion detection research, with the open-source Weka software for the C4.5 model and available C++ code for the FANN model.
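The three measures named in this abstract can be computed from a binary confusion matrix. The following is a minimal sketch, assuming the commonly used definitions (detection rate as TP/(TP+FN), false alarm rate as FP/(FP+TN)); the abstract does not give the paper's exact formulas, and the function name `ids_metrics`, the 1/0 label encoding, and the sample labels are hypothetical.

```python
from typing import Dict, Sequence


def ids_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute IDS measures; labels are assumed to be 1 = attack, 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal traffic passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed attacks
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "detection_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_alarm_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical labels for illustration only (not KDD Cup 99 data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = ids_metrics(y_true, y_pred)
```

Reporting the false alarm rate alongside the detection rate matters because a model that flags everything as an attack reaches a perfect detection rate while being useless in practice.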

J48 and ADTree for forecast of leaving of hospitals

  • Halim, Faisal;Muttaqin, Rizal
    • Korean Journal of Artificial Intelligence / v.4 no.1 / pp.11-13 / 2016
  • Medical technology has recently developed rapidly to meet the desire for a healthy life. As average lifespans have lengthened, people see doctors for many reasons. This study examines the rate of patients leaving hospitals in departments of both surgery and internal medicine. Data mining offers linear models, trees, classification rules, association, and other algorithms; this study used two decision tree algorithms, J48 and ADTree, and compared their results on the data. The two algorithms showed similar performance, but they were not equivalent, so more detailed experiments are required. More experimental data should be collected in the future and examined from various points of view. Advances in medical technology bring dreams, hope, and pleasure, and those who suffer from incurable diseases need them. Experiments should be run in an environment close to reality so that the data can be examined carefully, people of various ages can visit hospitals, and survival rates can increase.

Decision Tree Classifier for Multiple Abstraction Levels of Data (다중 추상화 수준의 데이터를 위한 결정 트리 분류기)

  • Jeong, Min-A;Lee, Do-Heon
    • The KIPS Transactions:PartD / v.10D no.1 / pp.23-32 / 2003
  • Since data is collected from disparate sources in many actual data mining environments, it is common to have data values at different abstraction levels. This paper shows that such multiple abstraction levels can cause undesirable effects in decision tree classification. After explaining that forcibly equalizing abstraction levels cannot provide a satisfactory solution to this problem, it presents a method that uses the data as it is. The proposed method accommodates the generalization/specialization relationship between data values in both the construction and the class assignment phases of decision tree classification. The experimental results show that the proposed method reduces classification error rates significantly when multiple abstraction levels of data are involved.

Memory Improvement Method for Extraction of Frequent Patterns in DataBase (데이터베이스에서 빈발패턴의 추출을 위한 메모리 향상기법)

  • Park, In-Kyu
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.2 / pp.127-133 / 2019
  • Frequent item extraction has so far required pattern searches and traversal of the FP-Tree; because the mining data is stored in a tree, searching it costs CPU time. To overcome these drawbacks, this paper assigns each item a location identifier within the transaction data, without relying on conditional FP-Trees, and converts the transaction data into a two-dimensional position look-up table, which eases access in both time and space. We propose an algorithm whose mapping scheme between items and their locations guarantees linear time complexity. Experimental results on data sets obtained from the FIMI repository website show that the proposed method substantially reduces execution time and memory usage.
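The idea of replacing tree traversal with a position look-up can be illustrated with a vertical layout that maps each item to the transaction positions where it occurs. This is a sketch under stated assumptions: the abstract does not specify the layout of the paper's two-dimensional table, so the item-to-position map below is a stand-in, the helper names and sample transactions are hypothetical, and no claim is made about matching the paper's linear-time guarantee.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, List, Sequence, Set


def build_position_table(transactions: List[List[str]]) -> Dict[str, Set[int]]:
    """Map each item to the set of transaction indices that contain it."""
    table: Dict[str, Set[int]] = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            table[item].add(tid)
    return dict(table)


def support(table: Dict[str, Set[int]], itemset: Sequence[str]) -> int:
    """Support of an itemset = size of the intersection of its position sets."""
    positions = [table.get(item, set()) for item in itemset]
    if not positions:
        return 0
    return len(set.intersection(*positions))


def frequent_pairs(table: Dict[str, Set[int]], min_support: int) -> Dict[FrozenSet[str], int]:
    """Find frequent 2-itemsets by intersecting position sets, with no tree."""
    frequent_items = sorted(i for i, tids in table.items() if len(tids) >= min_support)
    result: Dict[FrozenSet[str], int] = {}
    for a, b in combinations(frequent_items, 2):
        count = len(table[a] & table[b])
        if count >= min_support:
            result[frozenset((a, b))] = count
    return result


# Hypothetical transactions (not from the FIMI repository).
transactions = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
table = build_position_table(transactions)
pairs = frequent_pairs(table, min_support=3)
```

Once the table is built, a support count needs only set intersections on the stored positions, so the transactions are scanned a single time.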

An Efficient Candidate Pattern Tree Structure and Algorithm for Incremental Web Mining (점진적인 웹 마이닝을 위한 효율적인 후보패턴 저장 트리구조 및 알고리즘)

  • Kang, Hee-Seong;Park, Byung-Joon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.44 no.1 / pp.71-79 / 2007
  • Recent advances in Internet infrastructure have resulted in a large number of huge Web sites and portals worldwide. These Web sites are visited by various types of users in many different ways. Among the Web page access sequences of different users, some occur so frequently that they may deserve attention from interested parties. We call these frequent access patterns, and we call access sequences that may become frequent candidate patterns. Since candidate patterns play an important role in incremental Web mining, it is important to generate, add, delete, and search for them efficiently. This paper presents a novel tree structure that stores candidate patterns efficiently, together with a set of algorithms for generating the tree structure, adding new patterns, deleting unnecessary patterns, and searching for needed ones. The proposed tree structure has a kind of three-dimensional link structure, and its nodes are layered.
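The add, delete, and search operations on stored access sequences can be sketched with an ordinary prefix tree. This is only an illustration under stated assumptions: it is a plain trie with counts, not the paper's three-dimensional layered link structure, and the class names, method names, and page names are hypothetical.

```python
from typing import Dict, List


class PatternNode:
    """One page in an access sequence; count marks sequences ending here."""

    def __init__(self) -> None:
        self.children: Dict[str, "PatternNode"] = {}
        self.count = 0


class CandidatePatternTree:
    """Prefix tree that stores page access sequences with occurrence counts."""

    def __init__(self) -> None:
        self.root = PatternNode()

    def add(self, sequence: List[str], count: int = 1) -> None:
        node = self.root
        for page in sequence:
            node = node.children.setdefault(page, PatternNode())
        node.count += count

    def search(self, sequence: List[str]) -> int:
        """Return the stored count, or 0 if the sequence is not stored."""
        node = self.root
        for page in sequence:
            child = node.children.get(page)
            if child is None:
                return 0
            node = child
        return node.count

    def delete(self, sequence: List[str]) -> bool:
        """Remove a stored sequence and prune nodes that become unused."""
        path = [self.root]
        for page in sequence:
            child = path[-1].children.get(page)
            if child is None:
                return False
            path.append(child)
        if path[-1].count == 0:
            return False
        path[-1].count = 0
        # Walk back toward the root, dropping empty leaf nodes.
        for depth in range(len(sequence), 0, -1):
            node = path[depth]
            if node.count or node.children:
                break
            del path[depth - 1].children[sequence[depth - 1]]
        return True


# Hypothetical access sequences.
tree = CandidatePatternTree()
tree.add(["home", "news", "sports"])
tree.add(["home", "news", "sports"])
tree.add(["home", "shop"])
count_before = tree.search(["home", "news", "sports"])
removed = tree.delete(["home", "news", "sports"])
count_after = tree.search(["home", "news", "sports"])
```

Sequences that share a prefix share nodes, so an incremental update touches only the path of the affected sequence.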

Design of Heuristic Decision Tree (HDT) Using Human Knowledge (인간 지식을 이용한 경험적 의사결정트리의 설계)

  • Yoon, Tae-Tok;Lee, Jee-Hyong
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.4 / pp.525-531 / 2009
  • Data mining is the process of extracting hidden patterns from collected data. Because collected data serve as the basic information for prediction and recommendation, incorrect data must be identified to improve the quality of the analysis. Existing methods for identifying unexpected data rely mainly on statistics or on simple distances between data points. However, because these methods do not consider the environment and characteristics of the data, even meaningful data may be excluded from the analysis. This study proposes a method that assigns weights to human heuristic knowledge by comparing it with the collected data, and uses those weights to create a decision tree. Data discrimination by the proposed method is more credible because human knowledge is reflected in the created tree. The validity of the proposed method is verified through an experiment.

Intelligent Fault Diagnosis System Using Hybrid Data Mining (하이브리드 데이터마이닝을 이용한 지능형 이상 진단 시스템)

  • Baek, Jun-Geol;Heo, Jun
    • Proceedings of the Korean Operations and Management Science Society Conference / 2005.05a / pp.960-968 / 2005
  • The high cost of maintaining complex manufacturing processes makes an efficient maintenance system necessary. For effective maintenance of a manufacturing process, faults should be diagnosed precisely and an appropriate maintenance action should be executed. This paper proposes an intelligent fault diagnosis system using hybrid data mining. In this system, the fault diagnosis rules are generated by a hybrid decision tree/genetic algorithm, and the most effective maintenance action is selected by a decision network and AHP. To verify the proposed system, we compared the accuracy of the hybrid decision tree/genetic algorithm with that of a general decision tree learning algorithm (C4.5) on data collected from a coil-spring manufacturing process.

Correlation Analysis of the Frequency and Death Rates in Arterial Intervention using C4.5

  • Jung, Yong Gyu;Jung, Sung-Jun;Cha, Byeong Heon
    • International journal of advanced smart convergence / v.6 no.3 / pp.22-28 / 2017
  • With the recent development of technologies for managing vast amounts of data, data mining has had a major impact on all industries. Data mining is the process of discovering useful correlations hidden in data, extracting actionable information for the future, and using it for decision making. In other words, it is a core process of Knowledge Discovery in Databases (KDD) that transforms input data and derives useful information, extracting previously unknown information from a large database. In this paper, the frequency and mortality of percutaneous coronary intervention for patients with heart disease are broken down by region, and the C4.5 decision tree algorithm is used to analyze the difference between frequency and mortality by region.
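C4.5 chooses splits by gain ratio, that is, information gain normalized by the entropy of the split itself. The following is a minimal sketch for a single categorical attribute; the function names and the sample values are hypothetical and are not taken from the paper's data.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    total = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical records: region attribute vs. an outcome label.
region = ["north", "north", "south", "south"]
outcome = ["high", "high", "low", "low"]
perfect = gain_ratio(region, outcome)
useless = gain_ratio(["north"] * 4, outcome)
```

The attribute with the highest gain ratio is chosen as the split; an attribute with a single value carries no split information and scores zero.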

A study on data mining techniques for soil classification methods using cone penetration test results

  • Junghee Park;So-Hyun Cho;Jong-Sub Lee;Hyun-Ki Kim
    • Geomechanics and Engineering / v.35 no.1 / pp.67-80 / 2023
  • Due to the nature of the conjunctive Cone Penetration Test (CPT), which does not verify the actual sample directly, geotechnical engineers commonly classify underground geomaterials from CPT results using classification diagrams proposed by various researchers. However, such classification diagrams may fail to reflect local geotechnical characteristics, potentially resulting in misclassification that does not align with the actual stratification in regions with strong local features. To address this, this paper presents an objective method for deriving more accurate local CPT soil classification criteria, which uses C4.5 decision tree models trained with CPT results from the clay-dominant southern coast of Korea and the sand-dominant region in South Carolina, USA. The results and analyses demonstrate that the C4.5 algorithm, in conjunction with oversampling, outlier removal, and pruning methods, can enhance and optimize the decision tree-based CPT soil classification model.
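Oversampling, one of the steps named in this abstract, balances class sizes before a tree is trained. The sketch below assumes plain random duplication of minority-class rows; the abstract does not say which oversampling variant the paper uses, and the function name, the feature tuples, and the soil labels are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def random_oversample(
    rows: Sequence[Tuple[float, ...]],
    labels: Sequence[Hashable],
    seed: int = 0,
) -> Tuple[List[Tuple[float, ...]], List[Hashable]]:
    """Duplicate randomly chosen minority rows until all classes are equal."""
    rng = random.Random(seed)
    by_class: Dict[Hashable, List[Tuple[float, ...]]] = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    if not by_class:
        return [], []
    target = max(len(members) for members in by_class.values())
    out_rows: List[Tuple[float, ...]] = []
    out_labels: List[Hashable] = []
    for label, members in by_class.items():
        # Draw extra copies only for classes smaller than the largest one.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical two-feature readings with an imbalanced label set.
rows = [(1.2, 3.5), (0.9, 4.1), (8.5, 0.6), (1.1, 3.8)]
labels = ["clay", "clay", "sand", "clay"]
balanced_rows, balanced_labels = random_oversample(rows, labels)
class_counts = Counter(balanced_labels)
```

Balancing should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.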

Modeling of Environmental Survey by Decision Trees

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Proceedings of the Korean Data and Information Science Society Conference (한국데이터정보과학회:학술대회논문집) / 2004.10a / pp.63-75 / 2004
  • The decision tree approach is most useful in classification problems, where it divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing, fraud detection, data reduction and variable screening, and category merging. We analyze Gyeongnam social indicator survey data using decision tree techniques to obtain environmental information. These decision tree outputs can be used for environmental preservation and improvement.