• Title/Summary/Keyword: Tree mining

Learning to Prevent Inactive Student of Indonesia Open University

  • Tama, Bayu Adhi
    • Journal of Information Processing Systems / v.11 no.2 / pp.165-172 / 2015
  • The inactive student rate is becoming a major problem in most open universities worldwide. In Indonesia, roughly 36% of students were found to be inactive in 2005. Data mining has been successfully employed to solve problems in many domains, including education. We propose a method for preventing student inactivity by mining knowledge from student record systems with several state-of-the-art ensemble methods, such as Bagging, AdaBoost, Random Subspace, Random Forest, and Rotation Forest. The most influential attributes affecting student inactivity, as well as demographic attributes (marital status and employment), were successfully identified. The complexity and accuracy of the classification techniques were also compared, and the experimental results show that Rotation Forest, with a decision tree as the base classifier, achieves the best performance among the classifiers.
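
To make the ensemble idea concrete, the following is a minimal sketch of Bagging only, one of the methods named in the abstract. It is not the paper's implementation: the base learner is a one-feature decision stump rather than a full decision tree, and the feature, labels, and function names (`train_stump`, `bagging`) are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def train_stump(points):
    """Fit a one-feature threshold rule (decision stump) to (x, label) pairs."""
    xs = sorted({x for x, _ in points})
    labels = sorted({label for _, label in points})
    majority = Counter(label for _, label in points).most_common(1)[0][0]
    # best = (training errors, threshold, label left of threshold, label right of it)
    best = (len(points) + 1, None, majority, majority)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2  # candidate threshold between neighbouring values
        for left in labels:
            for right in labels:
                errors = sum((left if x <= t else right) != y for x, y in points)
                if errors < best[0]:
                    best = (errors, t, left, right)
    _, t, left, right = best
    # With no usable threshold, fall back to the majority label.
    return lambda x: left if t is None or x <= t else right


def bagging(points, n_models=25, seed=0):
    """Train stumps on bootstrap samples and combine them by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(points) for _ in points]  # sample with replacement
        models.append(train_stump(sample))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Hypothetical toy data: one numeric feature and a student status label.
data = [(x, "inactive") for x in range(0, 10)] + [(x, "active") for x in range(10, 20)]
predict = bagging(data)
print(predict(2), predict(17))
```

Each stump sees a different bootstrap sample, so the individual thresholds vary; the majority vote smooths out that variation, which is the property Bagging relies on.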

Exploration of Association Rules for Social Survey Data

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Proceedings of the Korean Data and Information Science Society Conference (한국데이터정보과학회:학술대회논문집) / 2005.04a / pp.18-24 / 2005
  • The methods of data mining include decision trees, association rules, clustering, neural networks, and so on. Data mining is a method for finding useful information in the large amounts of data held in a database. It is used to find hidden knowledge, unexpected patterns, and new rules in massive data. We analyze the 2003 Gyeongnam social indicator survey data using the association rule technique to obtain environmental information. Association rules are useful for determining correlations between attributes of a relation and have applications in the marketing, financial, and retail sectors. We can use association rule outputs for environmental preservation and environmental improvement.
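
As an illustration of how an association rule is scored, the sketch below computes the two standard measures, support and confidence, for a rule "antecedent → consequent". The survey responses and item names are hypothetical and are not taken from the Gyeongnam data.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent): support of both over support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical survey responses, one set of answers per respondent.
responses = [
    {"recycles", "uses_public_transport"},
    {"recycles", "uses_public_transport", "saves_water"},
    {"recycles"},
    {"saves_water"},
]

rule_support = support(responses, {"recycles", "uses_public_transport"})
rule_confidence = confidence(responses, {"recycles"}, {"uses_public_transport"})
print(rule_support, rule_confidence)
```

Here the rule "recycles → uses_public_transport" holds in 2 of 4 responses (support 0.5) and in 2 of the 3 responses that contain "recycles" (confidence about 0.67).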

Fault Diagnosis of Equipment of Wastewater Treatment Plants by Vibration Signal Analysis Using Time-Series Data Mining

  • Choi, Dae-Won;Bae, Hyeon;Chun, Seung-Pyo;Kim, Sung-Shin
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 2005.06a / pp.2192-2197 / 2005
  • This paper describes how to diagnose SBR plant equipment using time-series data mining. The diagnosis is based on vibration signals acquired from each device used for process control. Data transformation techniques, including two data preprocessing techniques, and data mining methods were employed in the data analysis. The proposed method is suitable not only for SBR equipment but also for other industrial devices. The experimental results obtained on a lab-scale SBR plant show good equipment-management performance.

Characteristics on Inconsistency Pattern Modeling as Hybrid Data Mining Techniques (혼합 데이터 마이닝 기법인 불일치 패턴 모델의 특성 연구)

  • Hur, Joon;Kim, Jong-Woo
    • Journal of Information Technology Applications and Management / v.15 no.1 / pp.225-242 / 2008
  • IPM (Inconsistency Pattern Modeling) is a hybrid supervised learning technique that uses the inconsistency patterns of input variables in mining data sets. The IPM tries to improve prediction accuracy by combining more than two different supervised learning methods. Previous related studies have shown that the IPM is superior to single existing supervised learning methods such as neural networks, decision tree induction, and logistic regression, and also superior to existing combined-model methods such as Bagging, Boosting, and Stacking. The objective of this paper is to explore the characteristics of the IPM. To understand these characteristics, three experiments were performed. In these experiments, performance improvements are high when the prediction inconsistency ratio between two different supervised learning techniques is high and when the distance between the supervised learning methods on the MDS (Multi-Dimensional Scaling) map is long.
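
The abstract does not define the prediction inconsistency ratio. A minimal sketch follows, under the assumption that it is the share of cases on which two classifiers give different predictions; the function name and the example predictions are hypothetical.

```python
def inconsistency_ratio(preds_a, preds_b):
    """Share of cases where two models disagree (assumed definition, not from the paper)."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        raise ValueError("prediction lists must not be empty")
    disagreements = sum(a != b for a, b in zip(preds_a, preds_b))
    return disagreements / len(preds_a)


# Hypothetical predictions from two different supervised learners on five cases.
tree_preds = [1, 0, 1, 1, 0]
logit_preds = [1, 1, 1, 0, 0]

ratio = inconsistency_ratio(tree_preds, logit_preds)
print(ratio)
```

The two models disagree on the second and fourth cases, so the ratio is 2/5 = 0.4. Under the abstract's finding, a pair of learners with a higher ratio would be the more promising candidates to combine.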

Twostep Clustering of Environmental Indicator Survey Data

  • Park, Hee-Chang
    • Proceedings of the Korean Data and Information Science Society Conference (한국데이터정보과학회:학술대회논문집) / 2005.10a / pp.59-69 / 2005
  • Data mining techniques are used to find hidden knowledge, unexpected patterns, and new rules in massive data. The methods of data mining include decision trees, association rules, clustering, neural networks, and so on. Clustering is the process of grouping data into clusters so that objects within a cluster are highly similar to one another. It has been widely used in many applications, such as pattern analysis or recognition, data analysis, image processing, and off-line or on-line market research. We analyze the 2001 Gyeongnam social indicator survey data using the twostep clustering technique to obtain environmental information. Twostep clustering is classified as a partitional clustering method. We can apply these twostep clustering outputs to environmental preservation and improvement.

A Neuro-Fuzzy Model Approach for the Land Cover Classification

  • Han, Jong-Gyu;Chi, Kwang-Hoon;Suh, Jae-Young
    • Proceedings of the KSRS Conference / 1998.09a / pp.122-127 / 1998
  • This paper presents a neuro-fuzzy classifier derived from the generic model of a 3-layer fuzzy perceptron, together with classification software developed on the basis of the neuro-fuzzy model. A comparison of the neuro-fuzzy and maximum-likelihood classifiers is also presented. Airborne Multispectral Scanner (AMS) imagery of Tae-Duk Science Complex Town was used for this comparison. The neuro-fuzzy classifier was considerably more accurate in mixed-composition areas such as "bare soil", "dried grass", and "coniferous tree"; however, "cement road" and "asphalt road" were classified more correctly by the maximum-likelihood classifier than by the neuro-fuzzy classifier. Thus, the neuro-fuzzy model can be used to classify mixed-composition areas such as the natural environment of the Korean peninsula. From this research we conclude that the neuro-fuzzy classifier is superior in suppressing mixed-pixel classification errors, and more robust to training-site heterogeneity and to the use of land-use class labels that are mixtures of land cover signatures.

Twostep Clustering of Environmental Indicator Survey Data

  • Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.17 no.1 / pp.1-11 / 2006
  • Data mining techniques are used to find hidden knowledge, unexpected patterns, and new rules in massive data. The methods of data mining include decision trees, association rules, clustering, neural networks, and so on. Clustering is the process of grouping data into clusters so that objects within a cluster are highly similar to one another. It has been widely used in many applications, such as pattern analysis or recognition, data analysis, image processing, and off-line or on-line market research. We analyze the 2001 Gyeongnam social indicator survey data using the twostep clustering technique to obtain environmental information. Twostep clustering is classified as a partitional clustering method. We can apply these twostep clustering outputs to environmental preservation and improvement.

Data mining approach to predicting user's past location

  • Lee, Eun Min;Lee, Kun Chang
    • Journal of the Korea Society of Computer and Information / v.22 no.11 / pp.97-104 / 2017
  • Location prediction has been successfully used to provide high-quality location-based services to customers in many applications. Conventionally, location prediction means predicting future locations based on a user's past movement history. However, as location prediction needs expand to more complicated cases, it frequently becomes necessary to infer the locations that a target user visited in the past. Typical cases include identifying locations that infectious disease carriers may have visited, or that crime suspects may have dropped by on a certain day within a specific time band. Therefore, the primary goal of this study is to predict locations that users visited in the past. The information used for this purpose includes users' demographic information and movement histories. Data mining classifiers such as Bayesian networks, neural networks, support vector machines, and decision trees were applied to a contextual dataset of 6868 records, and their performance was compared. The results show that the general Bayesian network is the most robust classifier.

A Forecast Model on High School Students' Suicidal Ideation: The Investigation Risk Factors and Protective Factors Using Data Mining (고등학생의 자살사고 예측모형 : 데이터마이닝을 적용한 위험요인과 보호요인의 탐색)

  • 이주리
    • Journal of the Korean Home Economics Association / v.47 no.5 / pp.67-77 / 2009
  • This study examined risk factors and protective factors in high school students’ suicidal ideation. Participants were 2000 adolescents from the KEEP (Korean Education and Employment Panel). A data mining decision tree model revealed the following: (1) Irrespective of sex, the most important predictor was the father-adolescent relationship. (2) A positive mother-adolescent relationship was a protective factor when the father-adolescent relationship was negative. (3) Family activities were a risk factor when both the mother-adolescent and father-adolescent relationships were negative. (4) Low self-evaluation was a risk factor for adolescents with serious agony about personality when the father-adolescent relationship was positive.

A Data-Mining Model to Support new Customer Acquisition for Internet Telephony(VoIP) (인터넷전화(VoIP)의 신규고객 유치를 지원하는 데이터마이닝 모델)

  • Ha, Sung-Ho;Yang, Jeong-Won;Song, Young-Mi
    • Journal of Information Technology Applications and Management / v.17 no.2 / pp.133-154 / 2010
  • Recently, Internet Telephony has become increasingly popular in the telecommunication industry. However, previous research on Internet Telephony has focused on analyzing specific Internet Telephony solutions and on identifying the Internet Telephony movement itself. Research on prediction models of Internet Telephony adoption has been minimal. The main purpose of this study is to develop models for predicting the intention to switch from traditional telephones to Internet Telephony. To do so, this study uses data mining methods to analyze demand in the IT communications market and to provide management strategies for Internet Telephony providers. In particular, this study uses discriminant analysis, logistic regression, classification trees, and neural networks to develop prediction models of Internet Telephony adoption. The models are compared with each other and a superior model is chosen.
