• Title/Summary/Keyword: Tree mining


Efficient Algorithms for Mining Association Rules Under the Interactive Environments (대화형 환경에서 효율적인 연관 규칙 알고리즘)

  • Lee, Jae-Moon
    • The KIPS Transactions:PartD
    • /
    • v.8D no.4
    • /
    • pp.339-346
    • /
    • 2001
  • In an interactive environment, association rules must be mined repeatedly with different minimum supports. This problem shares all the subproblems of conventional association rule mining, except that the rules are mined repeatedly from the same database. This paper proposes efficient algorithms that improve performance by reusing information about the candidate large itemsets computed while mining the previous association rules. The proposed algorithms were compared with the conventional algorithm in terms of execution time. The comparisons show that the proposed algorithms achieve a 10∼30% gain over the conventional algorithm.
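The idea of reusing itemset information across runs with different minimum supports can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a toy database small enough to count every itemset of up to two items exhaustively, and the names `count_itemsets`, `large_itemsets` and `cache` are hypothetical. A pruned algorithm such as Apriori would cache only candidates and might still need to count new ones when the threshold is lowered.

```python
from itertools import combinations

def count_itemsets(transactions, max_size=2):
    """Count how many transactions contain each itemset of up to max_size items."""
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    return counts

def large_itemsets(counts, n_transactions, min_support):
    """Filter cached counts by a minimum support ratio, without rescanning the data."""
    threshold = min_support * n_transactions
    return {s: c for s, c in counts.items() if c >= threshold}

# Toy database (illustrative data, not from the paper).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

cache = count_itemsets(transactions)                      # single scan of the database
first = large_itemsets(cache, len(transactions), 0.5)     # first interactive query
second = large_itemsets(cache, len(transactions), 0.75)   # second query reuses the cache
```

Here the second query with a different minimum support is answered from the cached counts alone, which is the kind of saving an interactive setting can exploit.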


A Study on Using Social Big Data for Expanding Analytical Knowledge - Domestic Big Data supply-demand expectation - (분석지의 확장을 위한 소셜 빅데이터 활용연구 - 국내 '빅데이터' 수요공급 예측 -)

  • Kim, Jung-Sun;Kwon, Eun-Ju;Song, Tae-Min
    • Knowledge Management Research
    • /
    • v.15 no.3
    • /
    • pp.169-188
    • /
    • 2014
  • Big data appears to be changing the knowledge management systems and methods of enterprises to a large extent. Further, how unstructured data such as images, video, sensor data and text are utilized may determine decisions on expanding the knowledge management of an enterprise or government. In this light, this paper attempts to derive a prediction model of demand and supply for the big data market of Korea through a data mining decision tree, using text big data generated over 3 years on the web and SNS, in order to expand the forms of knowledge management. The results indicate that the big data market of Korea is a government-led market focused on H/W and storage. Further, demanders of big data were found to place importance on attribute factors including interest, quickness and economics, whereas suppliers place importance on innovation and growth. The results show that the factors affecting acceptance of big data technology differ between suppliers and demanders. This article may provide a basic method for studying the expansion of analytical forms in enterprises and their connection with management activities.


A Study on the Database Marketing using Data Mining in the Traditional Medicine (데이터마이닝을 활용한 한방분야에서의 데이터베이스 마케팅에 대한 연구)

  • Lee Sang-Young;Lee Yun-Seok
    • Journal of the Korea Society of Computer and Information
    • /
    • v.10 no.5 s.37
    • /
    • pp.271-280
    • /
    • 2005
  • This study elicits the factors affecting medical examination in traditional medicine using the decision tree technique and characterizes patients by clustering analysis. It also draws results from an association analysis of the forms of disease in the re-hospitalized patient group. The obtained results were analyzed for their effect on hospital profits. Thus, through application of data mining techniques to database marketing in traditional medicine, the characteristics of patient clients can be identified for the objective induction of factors affecting hospital profits. Practical application of database marketing as presented in this study will bring about fundamental efficiency and vitalization of hospital management.


Developing the administrative model using the data mining technique for injury in National Health Insurance (데이터마이닝 기법을 활용한 국민건강보험 상해상병 관리모형 개발)

  • Park, Il-Su;Han, Jun-Tae;Sohn, Hae-Sook;Kang, Suk-Bok
    • Journal of the Korean Data and Information Science Society
    • /
    • v.22 no.3
    • /
    • pp.467-476
    • /
    • 2011
  • We developed a hybrid model that couples a predictive model with a business rule model for the administration of injury, utilizing medical data of the National Health Insurance in Korea. We performed decision tree analysis using data mining methodology with SAS Enterprise Miner 4.1. We also investigated several business rules for benefits (expenses paid by the insurer) and claims of injury in the National Health Insurance Corporation. The proposed hybrid model provides quite efficient and plausible results.

Fault Pattern Analysis and Restoration Prediction Model Construction of Pole Transformer Using Data Mining Technique (데이터마이닝 기법을 이용한 주상변압기 고장유형 분석 및 복구 예측모델 구축에 관한 연구)

  • Hwang, Woo-Hyun;Kim, Ja-Hee;Jang, Wan-Sung;Hong, Jung-Sik;Han, Deuk-Su
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.57 no.9
    • /
    • pp.1507-1515
    • /
    • 2008
  • It is essential for electric power companies to have a quick restoration system for faulted pole transformers, which account for most transformers, in order to supply stable electricity. However, restoration takes too much time when a transformer suddenly fails, because we currently rely on operators to investigate the causes of failure and decide on recovery methods. This paper presents the concept of 'Fault pattern analysis and Restoration prediction model using Data mining techniques', which is based on the accumulated fault records of pole transformers. To this end, it also suggests external and internal causes of fault that influence the fault pattern of pole transformers. Using the results, we expect to reduce not only defects in the manufacturing procedure, by upgrading quality, but also the time needed to predict fault patterns and to recover when faults occur.

An Efficient Algorithm For Mining Association Rules In Main Memory Systems (대용량 주기억장치 시스템에서 효율적인 연관 규칙 탐사 알고리즘)

  • Lee, Jae-Mun
    • The KIPS Transactions:PartD
    • /
    • v.9D no.4
    • /
    • pp.579-586
    • /
    • 2002
  • This paper proposes an efficient algorithm for mining association rules in large main memory systems. To do this, the paper first extends conventional algorithms such as DHP and Partition to make them compatible with large main memory systems, and then proposes an algorithm that improves the Partition algorithm by applying hash table and bitmap techniques. The proposed algorithm is compared with the extended DHP in an experimental environment, and the results show up to 65% performance improvement over the extended DHP.
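As a rough illustration of how a hash table and bitmaps can support in-memory support counting, the sketch below keeps one bitmap per item in a Python dictionary and counts support by intersecting bitmaps. It is a generic technique shown under assumptions, not the algorithm proposed in the paper; the function names `build_bitmaps` and `support` and the sample data are hypothetical.

```python
def build_bitmaps(transactions):
    """Map each item to an integer whose bit i is set when transaction i contains the item."""
    bitmaps = {}  # the dict plays the role of the hash table
    for tid, t in enumerate(transactions):
        for item in set(t):
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps

def support(itemset, bitmaps):
    """Support count = number of set bits in the AND of the items' bitmaps."""
    items = list(itemset)
    if not items:
        return 0
    bits = bitmaps.get(items[0], 0)
    for item in items[1:]:
        bits &= bitmaps.get(item, 0)  # unknown items contribute an empty bitmap
    return bin(bits).count("1")

# Toy database (illustrative data, not from the paper).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
bitmaps = build_bitmaps(transactions)
pair_support = support(["bread", "milk"], bitmaps)
```

Once the bitmaps are built in one pass, the support of any itemset is obtained with bitwise operations in memory rather than another scan of the transactions.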

Analysis of domestic and foreign research trends of Tricholoma matsutake using text mining techniques

  • Choi, Ah Hyeon;Kang, Jun Won
    • Korean Journal of Agricultural Science
    • /
    • v.48 no.3
    • /
    • pp.505-514
    • /
    • 2021
  • Among non-timber forest products, Tricholoma matsutake is a high value-added item. Many countries, including Korea, China, and Japan, are conducting research and technology development to increase artificial cultivation and productivity. However, the production of T. matsutake is on the decline due to global warming, abnormal temperatures and pine tree pest problems. Therefore, it is necessary to identify trends in domestic and foreign research on T. matsutake and to pursue preemptive research and development to preserve the genetic resources of T. matsutake and increase its productivity. Based on the correlation among high-frequency keywords, it was observed that microbial clusters of T. matsutake are mainly found in Korea. The main focus in China has been pharmacological studies on the ingredients of T. matsutake. The main focus in Japan has been on preserving the genetic diversity and species of T. matsutake. Thus, future domestic studies of T. matsutake will require pharmacological studies on its ingredients and studies on its genetic diversity and species conservation. In addition, unlike in China and Japan, genetic keywords did not appear at high frequency in Korea. Therefore, Korea will have to proceed with research using modern molecular biology techniques.

A Review of Machine Learning Algorithms for Fraud Detection in Credit Card Transaction

  • Lim, Kha Shing;Lee, Lam Hong;Sim, Yee-Wai
    • International Journal of Computer Science & Network Security
    • /
    • v.21 no.9
    • /
    • pp.31-40
    • /
    • 2021
  • The increasing number of credit card fraud cases has become a considerable problem over the past decades. This phenomenon is due to the expansion of new technologies, including the increased popularity and volume of online banking transactions and e-commerce. To address the problem of credit card fraud detection, a rule-based approach has been widely used to detect and guard against fraudulent activities. However, it requires huge computational power and involves high complexity in defining and building the rule base for pattern matching in order to precisely identify fraud patterns. In addition, it lacks the intelligence and ability to predict or analyse transaction data when looking for new fraud patterns and strategies. As such, Data Mining and Machine Learning algorithms are proposed in this paper to overcome these shortcomings. The aim of this paper is to highlight the important techniques and methodologies that are employed in fraud detection, while at the same time focusing on the existing literature. Methods such as Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), naïve Bayesian, k-Nearest Neighbour (k-NN), Decision Tree and Frequent Pattern Mining algorithms are reviewed and evaluated for their performance in detecting fraudulent transactions.

Exploration of CHAID Algorithm by Sampling Proportion

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2003.10a
    • /
    • pp.215-228
    • /
    • 2003
  • Decision tree algorithms are used extensively for data mining in many domains such as retail target marketing, fraud detection, data reduction and variable screening, interaction effect identification, category merging and discretizing continuous variables, etc. CHAID (Chi-square Automatic Interaction Detector) is an exploratory method used to study the relationship between a dependent variable and a series of predictor variables. CHAID modeling selects a set of predictors and their interactions that optimally predict the dependent measure. In this paper we explore the CHAID algorithm in terms of accuracy and speed by sampling proportion.
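The chi-square statistic at the core of CHAID can be sketched as follows. This shows only the Pearson chi-square computation for a contingency table of one predictor against the dependent variable; a full CHAID implementation also merges categories and compares adjusted p-values, which is omitted here. The function name `chi_square` and the sample tables are hypothetical, and the sketch assumes every row and column total is nonzero.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of predictor and outcome.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Rows: categories of a predictor; columns: classes of the dependent variable.
associated = chi_square([[10, 20], [20, 10]])    # predictor related to the outcome
independent = chi_square([[10, 10], [10, 10]])   # no relationship at all
```

A larger statistic for the same table shape indicates a stronger association, which is why a chi-square-based tree prefers such predictors when choosing a split.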


K-means Clustering for Environmental Indicator Survey Data

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2005.04a
    • /
    • pp.185-192
    • /
    • 2005
  • There are many data mining techniques, such as association rules, decision trees, neural network analysis, clustering, genetic algorithms, Bayesian networks, memory-based reasoning, etc. We analyze the 2003 Gyeongnam social indicator survey data using the k-means clustering technique for environmental information. Clustering is the process of grouping data into clusters so that objects within a cluster are highly similar to one another. In this paper, among several clustering techniques, we used k-means clustering, which is classified as a partitional clustering method. The k-means clustering outputs can be applied to environmental preservation and environmental improvement.
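For readers unfamiliar with the method, a minimal sketch of k-means (Lloyd's iteration) in plain Python is given below. It is not the authors' analysis of the survey data: the sample points are made up, the function name `kmeans` is hypothetical, and the first k points are used as starting centroids purely to keep the example deterministic.

```python
def kmeans(points, k, iterations=100):
    """Partition points (tuples of floats) into k clusters by Lloyd's iteration."""
    centroids = [tuple(p) for p in points[:k]]  # assumption: first k points as seeds
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: assignments no longer change
            break
        centroids = new_centroids
    return centroids, clusters

# Made-up two-dimensional data with two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]
centroids, clusters = kmeans(points, k=2)
```

Each point ends up in exactly one cluster, which is what makes k-means a partitional method.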
