• Title/Summary/Keyword: Tree mining

Acacia Dominated Area Exclosures Enhance the Carbon Sequestration Potential of Degraded Dryland Forest Ecosystems

  • Halefom, Zenebu;Kebede, Fassil;Fitwi, Ibrahim;Abraha, Zenebe;Gebresamuel, Girmay;Birhane, Emiru
    • Journal of Forest and Environmental Science / v.36 no.1 / pp.25-36 / 2020
  • Area exclosure is a widely practiced intervention for restoring degraded lands, though evidence of its impact on sequestering terrestrial and soil carbon is scanty. This study investigated the effect of exclosures of different ages on the carbon sequestration potential of degraded dryland ecosystems under restoration in eastern Tigray, northern Ethiopia. Twelve plots, each divided into three layers, were randomly selected from 5-, 10-, and 15-year-old exclosures and paired adjacent open grazing land. Tree and shrub biomass was determined by destructive sampling, while herb layer biomass was determined by total harvest. The average total biomass was 13.6, 24.8, 27.1, and 55.5 Mg ha⁻¹ for open grazing land and the 5-, 10-, and 15-year exclosures, respectively. The carbon content of plant species ranged from 48 to 53 percent of dry biomass. The total carbon stored in the 5-, 10-, and 15-year exclosures was 39, 46.3, and 64.6 Mg C ha⁻¹, respectively, while in the open grazing land it was 24.7 Mg C ha⁻¹. Carbon stock is age dependent and increases with exclosure age. The difference in total carbon content between exclosures and open grazing land ranged from 14.3 to 40 Mg C ha⁻¹. Although it is difficult to extrapolate this result far into the future, the average annual carbon sequestered in the oldest exclosure was about 2.7 Mg C ha⁻¹ yr⁻¹. For improving degraded areas and sequestering carbon, area exclosures are a promising option.
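The figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the reported difference is simply the exclosure stock minus the open grazing stock, and that the annual rate is that difference divided by exclosure age. The abstract does not state either formula, so both are a reading of the numbers, and the function names are hypothetical.

```python
# Total carbon stocks reported in the abstract, in Mg C per hectare.
OPEN_GRAZING_STOCK = 24.7
EXCLOSURE_STOCKS = {5: 39.0, 10: 46.3, 15: 64.6}  # exclosure age (years) -> stock


def carbon_gain(stock, baseline=OPEN_GRAZING_STOCK):
    """Extra carbon held by an exclosure relative to open grazing land."""
    return stock - baseline


def annual_rate(stock, age_years, baseline=OPEN_GRAZING_STOCK):
    """Average yearly gain, ASSUMING linear accumulation since the exclosure began."""
    return carbon_gain(stock, baseline) / age_years


# Gains per exclosure age, rounded to the one decimal place used in the abstract.
gains = {age: round(carbon_gain(stock), 1) for age, stock in EXCLOSURE_STOCKS.items()}
oldest_rate = round(annual_rate(EXCLOSURE_STOCKS[15], 15), 1)

print(gains)
print(oldest_rate)
```

Under these assumptions the gains come out at 14.3, 21.6, and 39.9 Mg C ha⁻¹, and the 15-year rate rounds to 2.7 Mg C ha⁻¹ yr⁻¹, consistent with the reported range and annual rate.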

Privacy Policy Analysis Techniques Using Deep Learning (딥러닝을 활용한 개인정보 처리방침 분석 기법 연구)

  • Jo, Yong-Hyun;Cha, Young-Kyun
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.2 / pp.305-312 / 2020
  • The Privacy Act stipulates that a privacy policy, the document stating how personal information is handled, must be disclosed in order to guarantee the rights of information subjects, and the Fair Trade Commission treats the privacy policy as a set of terms and conditions and reviews it for unfairness under the Terms and Conditions Control Act. However, information subjects tend not to read privacy policies because they are complicated and difficult to understand. Simple and legible privacy policies would increase the likelihood of participation in online transactions, contributing to higher corporate sales and resolving the information asymmetry between operators and information subjects. In this study, complex privacy policies are analyzed using deep learning, and a model is presented for producing simplified privacy policies that are highly readable for information subjects. To build the model, the privacy policies of 258 domestic companies were compiled into a data set and analyzed using deep learning.

Study on Classification Algorithm based on Weight of Support and Confidence Degree (지지도와 신뢰도의 가중치에 기반한 분류알고리즘에 관한 연구)

  • Kim, Keun-Hyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.4 / pp.700-713 / 2009
  • Most existing classification algorithms in data mining have focused on efficiency, that is, on generating decision trees more rapidly with fewer computing resources. In this paper, we focus on efficiency as well as effectiveness, meaning the ability to generate more meaningful classification rules in application areas such as automatic ontology generation and business environments. To this end, we propose a novel function based on weighted support and confidence, and analyze the characteristics of this weighted function from a theoretical viewpoint. Furthermore, we propose a novel classification algorithm based on the weighted function and its characteristics. In evaluating the proposed algorithm, we found that it generates more significant classification rules more rapidly.
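As a rough illustration of the two quantities this abstract weights, the sketch below computes the support and confidence of a rule over a toy record set. The abstract does not give the paper's weighted function, so the linear combination in `weighted_score`, its parameter `w`, and all names and data here are purely hypothetical stand-ins.

```python
from typing import FrozenSet, List


def support(records: List[FrozenSet[str]], items: FrozenSet[str]) -> float:
    """Fraction of records that contain every item in `items`."""
    return sum(1 for r in records if items <= r) / len(records)


def confidence(records: List[FrozenSet[str]],
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """Estimated P(consequent | antecedent) from the records."""
    return support(records, antecedent | consequent) / support(records, antecedent)


def weighted_score(sup: float, conf: float, w: float = 0.5) -> float:
    """HYPOTHETICAL weighting: a linear blend, not the function from the paper."""
    return w * sup + (1.0 - w) * conf


# Toy records, each a set of attribute=value items.
records = [
    frozenset({"outlook=sunny", "play=no"}),
    frozenset({"outlook=sunny", "play=no"}),
    frozenset({"outlook=sunny", "play=yes"}),
    frozenset({"outlook=rain", "play=yes"}),
]
rule_if = frozenset({"outlook=sunny"})
rule_then = frozenset({"play=no"})

sup = support(records, rule_if | rule_then)
conf = confidence(records, rule_if, rule_then)
score = weighted_score(sup, conf, w=0.5)
print(sup, conf, score)
```

Raising `w` favors rules that cover many records, while lowering it favors rules that are rarely wrong. That is the kind of trade-off a weighted function over support and confidence exposes.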

Proposition of balanced comparative confidence considering all available diagnostic tools (모든 가능한 진단도구를 활용한 균형비교신뢰도의 제안)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.26 no.3 / pp.611-618 / 2015
  • According to Wikipedia, big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Data mining is the computational process of discovering patterns in huge data sets, involving methods at the intersection of association rules, decision trees, clustering, artificial intelligence, and machine learning. Association rule mining is a well-researched method for discovering interesting relationships between itemsets in huge databases and has been applied in various fields. There are positive, negative, and inverse association rules, according to the direction of association. When setting evaluation criteria for association rules, it may be desirable to consider all three types at the same time. To this end, we propose a balanced comparative confidence that considers sensitivity, specificity, false positives, and false negatives, check whether it satisfies the conditions for an association threshold given by Piatetsky-Shapiro, and compare it with comparative confidence and inversely comparative confidence through a few experiments.
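The four diagnostic quantities named in this abstract can be read off a 2x2 contingency table for a rule X → Y. The sketch below computes only those ingredients. It does not compute balanced comparative confidence, whose formula the abstract does not give. Treating Y as the actual condition and X as the test outcome is an assumption made for illustration, and the names and counts are hypothetical.

```python
from typing import Dict


def diagnostic_rates(n_xy: int, n_x_not_y: int,
                     n_not_x_y: int, n_not_x_not_y: int) -> Dict[str, float]:
    """Diagnostic measures from co-occurrence counts of X and Y.

    ASSUMED mapping: Y is the actual condition, X is the test outcome, so
    n_xy are true positives and n_x_not_y are false positives.
    """
    sensitivity = n_xy / (n_xy + n_not_x_y)                    # P(X | Y)
    specificity = n_not_x_not_y / (n_not_x_not_y + n_x_not_y)  # P(not X | not Y)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1.0 - specificity,
        "false_negative_rate": 1.0 - sensitivity,
    }


# Hypothetical counts over 100 transactions.
rates = diagnostic_rates(n_xy=40, n_x_not_y=10, n_not_x_y=20, n_not_x_not_y=30)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```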

A Sparse Data Preprocessing Using Support Vector Regression (Support Vector Regression을 이용한 희소 데이터의 전처리)

  • Jun, Sung-Hae;Park, Jung-Eun;Oh, Kyung-Whan
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.6 / pp.789-792 / 2004
  • In fields such as web mining, bioinformatics, and statistical data analysis, missing values occur in many different forms. These missing values make the training data sparse. Typically, missing values are replaced by predicted values such as the mean or mode. More advanced imputation methods, such as the conditional mean, tree methods, and Markov chain Monte Carlo algorithms, can also be used. However, general imputation models have the property that their predictive accuracy decreases as the ratio of missing values in the training data increases. Moreover, the number of available imputations is limited as the missing ratio increases. To address this problem, we propose preprocessing missing values with statistical learning theory, specifically the support vector regression of Vapnik. The proposed method can be applied to sparse training data. We verified the performance of our model using data sets from the UCI Machine Learning Repository.
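For context, the baseline this abstract contrasts against (replacing missing values with the mean or mode) can be sketched with the standard library alone. This is the simple baseline, not the support vector regression method the paper proposes. The function names and sample values are hypothetical.

```python
from statistics import mean, mode
from typing import List, Optional


def impute_mean(values: List[Optional[float]]) -> List[float]:
    """Replace missing numeric entries (None) with the mean of the observed ones."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def impute_mode(values: List[Optional[str]]) -> List[str]:
    """Replace missing categorical entries (None) with the most common observed value."""
    fill = mode(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


# Hypothetical columns with gaps.
ages = impute_mean([1.0, None, 3.0])
colors = impute_mode(["red", None, "red", "blue"])
print(ages)
print(colors)
```

Because every gap receives the same fill value, this baseline flattens the variance of a column as the missing ratio grows, which is one way to see the accuracy loss the abstract describes.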

A study on the analysis of customer loan for the credit finance company using classification model (분류모형을 이용한 여신회사 고객대출 분석에 관한 연구)

  • Kim, Tae-Hyung;Kim, Yeong-Hwa
    • Journal of the Korean Data and Information Science Society / v.24 no.3 / pp.411-425 / 2013
  • The importance and necessity of credit loans are increasing over time. It is also a natural consequence that an increase in borrower risk increases the risk of non-performing loans. Thus, accurate prediction is needed to prevent losses for a credit loan company. Our final goal is to build a reliable and accurate prediction model, so we proceed in the following steps. First, we obtain an appropriate sample using several resampling methods. Second, we consider a variety of models and tools to fit the resampled data. Finally, to find the best model for our real data, various models are compared and assessed.

A CRM Study on the Using of Data Mining - Focusing on the "A" Fashion Company - (데이타마이닝을 이용(利用)한 CRM 사례연구(事例硏究) - A 패션기업(企業)을 중심(中心)으로 -)

  • Lee, Yu-Soon
    • Journal of Fashion Business / v.6 no.5 / pp.136-150 / 2002
  • In this study, we propose a method for turning customers into regular customers, as a support system for improving the fashion garment industry, whose growth has become marginal as the market reaches full maturity. The customer creation method for a fashion garment company is to develop a marketing program that builds regular customers by using customer scoring to estimate existing customers' buying power, and to work out the company's minimum fixed sales by predicting future purchases. The study draws on a total of sixty thousand records created over the 11 months from September 2000 to July 2001. The data are part of the customer purchase history held by a company leading the Korean fashion garment industry. Of the sixty thousand records, however, the study used only 48,845 refined purchase records and 21,496 customer cases after excluding overlapping purchase data. The software used to handle the sixty thousand records was SAS e-miner. The analysis proceeded in three steps: first, profiling the purchasing customers; second, basket analysis to examine buying associations among associated goods; and third, estimating customer loyalty grades in three ways, using logit regression analysis, a decision tree, and an artificial neural network. The results suggest a method for estimating customer loyalty with three independent variables and two coefficients. The three independent variables are total purchase amount, number of items per purchase, and payment amount per purchased item. The two coefficients are "royal" and "normal", used for customer segmentation. The results show that the model based on logit regression analysis is valid as a method for estimating customer loyalty.

Multiple SVM Classifier for Pattern Classification in Data Mining (데이터 마이닝에서 패턴 분류를 위한 다중 SVM 분류기)

  • Kim, Man-Sun;Lee, Sang-Yong
    • Journal of the Korean Institute of Intelligent Systems / v.15 no.3 / pp.289-293 / 2005
  • Pattern classification extracts various types of pattern information expressing objects in the real world and decides their class. The top priority of pattern classification technologies is to improve classification performance, and to this end many studies have tried various approaches over the last 40 years. Classification methods used in pattern classification include base classifiers based on the probabilistic inference of patterns, decision trees, methods based on distance functions, neural networks, and clustering, but they are not efficient at analyzing large amounts of multi-dimensional data. Thus, there is active research on multiple classifier systems, which improve classification performance by combining a number of mutually complementary classifiers. The present study identifies problems in previous research on multiple SVM classifiers and proposes BORSE, a model that, based on a 1:M policy for extending SVM to a multi-class classifier, regards each SVM output as a signal with a non-linear pattern, trains a neural network on that pattern, and combines the final classification results.
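A 1:M policy is read here as one-versus-rest decomposition: a K-class problem is split into K binary problems, one per class. That reading is an assumption, since the abstract does not define the term. The sketch below shows only the label decomposition and the simplest way to merge the per-class outputs (taking the largest score). It trains no SVMs, and BORSE itself replaces the argmax step with a neural network trained on the SVM outputs. All names and scores are hypothetical.

```python
from typing import Dict, List, Sequence


def one_vs_rest_labels(labels: Sequence[str]) -> Dict[str, List[int]]:
    """For each class, relabel the samples as +1 (that class) or -1 (any other)."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in classes}


def combine_by_argmax(scores: Dict[str, float]) -> str:
    """Baseline combiner: pick the class whose binary classifier scored highest."""
    return max(scores, key=scores.get)


# Three-class toy labels produce three binary training targets.
binary_targets = one_vs_rest_labels(["a", "b", "c", "a"])
print(binary_targets)

# Hypothetical decision values from three already-trained binary classifiers.
predicted = combine_by_argmax({"a": -0.2, "b": 0.7, "c": 0.1})
print(predicted)
```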

Combined Application of Data Imbalance Reduction Techniques Using Genetic Algorithm (유전자 알고리즘을 활용한 데이터 불균형 해소 기법의 조합적 활용)

  • Jang, Young-Sik;Kim, Jong-Woo;Hur, Joon
    • Journal of Intelligence and Information Systems / v.14 no.3 / pp.133-154 / 2008
  • The data imbalance problem, which can be encountered in data mining classification problems, typically means that one class has far more or far fewer instances than the other classes. To solve the data imbalance problem, a number of techniques have been proposed based on re-sampling with replacement, adjusting decision thresholds, and adjusting the costs of the different classes. In this paper, we study the feasibility of combining the previously proposed techniques for dealing with the data imbalance problem, and suggest a combination method that uses a genetic algorithm to find the optimal combination ratio of the techniques. To improve the prediction accuracy for the minority class, we determine the combination ratio using the F-value of the minority class as the fitness function of the genetic algorithm. To compare its performance with that of the single techniques and of a matrix-style combination with random percentages, we performed experiments using four public datasets that are commonly used to compare methods for the data imbalance problem. The experimental results show the usefulness of the proposed method.
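The fitness measure named in this abstract, the F-value of the minority class, can be sketched as follows. The code assumes the balanced F1 form (the harmonic mean of precision and recall). The abstract does not say which F-value variant is used, and the function name and labels are hypothetical.

```python
from typing import Sequence


def minority_f_value(y_true: Sequence[int], y_pred: Sequence[int],
                     minority: int = 1) -> float:
    """F1 score computed only with respect to the minority class label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == minority and p == minority)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != minority and p == minority)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == minority and p != minority)
    if tp == 0:
        return 0.0  # no correct minority predictions, so precision and recall are 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Imbalanced toy data: 3 minority (1) and 7 majority (0) instances.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
fitness = minority_f_value(y_true, y_pred)
print(round(fitness, 3))
```

Plain accuracy would score these predictions at 0.8, largely because of the majority class. Scoring on the minority class alone is what makes this a usable fitness function for tuning an imbalance remedy.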

Study on Development of Classification Model and Implementation for Diagnosis System of Sasang Constitution (사상체질 분류모형 개발 및 진단시스템의 구현에 관한 연구)

  • Beum, Soo-Gyun;Jeon, Mi-Ran;Oh, Am-Suk
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.08a / pp.155-159 / 2008
  • In this thesis, to develop a new classification model of Sasang constitutional medical types that helps improve the accuracy of type diagnosis, various data mining classification models, namely discriminant analysis, decision tree analysis, neural network analysis, logistic regression analysis, and clustering analysis, which are the main classification methods, were applied to medical type classification questionnaires. In this manner, a model that scientifically classifies constitutional medical types in the field of Sasang Constitutional Medicine, a branch of traditional Korean medicine, was developed. The above-mentioned analysis models were also systematically compared and analyzed. In this study, a classification of Sasang constitutional medical types was developed based on the discriminant analysis model and the decision tree analysis model, which have relatively high accuracy, whose analysis procedures are easy to understand and explain, and which are easy to implement. A diagnosis system for Sasang constitution was also implemented by applying these two analysis models.