• Title/Summary/Keyword: 결정나무 (decision tree)

A study for improving data mining methods for continuous response variables (연속형 반응변수를 위한 데이터마이닝 방법 성능 향상 연구)

  • Choi, Jin-Soo;Lee, Seok-Hyung;Cho, Hyung-Jun
    • Journal of the Korean Data and Information Science Society / v.21 no.5 / pp.917-926 / 2010
  • Bagging and boosting are known to improve performance in classification problems. Many researchers have demonstrated their high performance experimentally for categorical responses, but not for continuous responses. We study whether bagging and boosting improve data mining methods for continuous responses, such as linear regression, decision trees, and neural networks. An analysis of eight real data sets shows empirically that bagging and boosting perform well.
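
As a minimal sketch of what bagging means for a continuous response, the snippet below averages the predictions of simple least-squares lines fitted to bootstrap resamples. The base learner, the toy data, and the names `fit_line` and `bagged_predict` are illustrative assumptions and are not taken from the paper.

```python
import random
from statistics import mean


def fit_line(points):
    """Least-squares fit y = a + b*x for a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: every x is identical
        return y_bar, 0.0
    b = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
    return y_bar - b * x_bar, b


def bagged_predict(points, x_new, n_models=50, seed=0):
    """Average the predictions of base learners fit on bootstrap resamples."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        # Bootstrap: draw len(points) observations with replacement.
        sample = [rng.choice(points) for _ in points]
        a, b = fit_line(sample)
        preds.append(a + b * x_new)
    return mean(preds)


# Toy data invented for illustration: y = 2x + 1 with no noise.
data = [(x, 2 * x + 1) for x in range(10)]
prediction = bagged_predict(data, 4.0)
```

On noise-free linear data every resample recovers the same line, so the bagged prediction matches the single-model prediction. Any benefit from bagging would show up only with noisy data or unstable base learners such as trees.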

Species Diversity of Forest Vegetation in Mt. Jangan, Chollabuk-do (전라북도 장안산 삼림식생의 종다양성)

  • Kim, Chang-Hwan;Myung, Hyun;Shin, Byung-Chuel
    • Korean Journal of Environment and Ecology / v.13 no.3 / pp.271-279 / 1999
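  • At 72 community sites on Mt. Jangan, Chollabuk-do, ten communities were distinguished by a phytosociological survey: the Quercus mongolica community, Quercus mongolica-Rhododendron schlippenbachii community, Quercus mongolica-Symplocos chinensis for. pilosa community, Quercus mongolica-Quercus serrata community, Quercus serrata community, Quercus variabilis community, Carpinus laxiflora community, Fraxinus rhynchophylla community, Cornus controversa community, and Fraxinus mandshurica community. For these communities, richness, heterogeneity, evenness, and dominance indices were calculated to analyze how species diversity varies with altitude, soil properties, and dominant species group. Species sequence-importance curves were used to determine the dominance rank of each plant and how each species shares the resources within the plant community. Altitude, soil factors (pH, base), and differences in dominant species acted as important variables affecting forest species diversity, and the variation in diversity among dominant species groups was influenced by topography and disturbance. In the species sequence-importance curves, the ten communities approached a lognormal distribution. Although the communities differed slightly, in general no particular species monopolized the resource space within the community, and resources were shared appropriately.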
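
The abstract does not say which formulas were used for the four indices. As a hedged sketch, the snippet below assumes common choices: species count for richness, Shannon's H' for heterogeneity, Pielou's J for evenness, and Simpson's index for dominance. The function name and the abundance values are hypothetical.

```python
import math


def diversity_indices(abundances):
    """Return (richness, Shannon H', Pielou J, Simpson dominance).

    The choice of these particular formulas is an assumption; the
    abstract names the index types but not their definitions.
    """
    counts = [n for n in abundances if n > 0]
    total = sum(counts)
    props = [n / total for n in counts]
    richness = len(counts)                           # number of species present
    shannon = -sum(p * math.log(p) for p in props)   # heterogeneity H'
    # Evenness is H' divided by its maximum, ln(richness).
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    dominance = sum(p * p for p in props)            # Simpson's index
    return richness, shannon, evenness, dominance


# Hypothetical community with four equally abundant species.
r, h, j, d = diversity_indices([25, 25, 25, 25])
```

With equal abundances, evenness reaches its maximum of 1 and dominance its minimum of 1/richness. That is the pattern the abstract describes when no single species monopolizes the resource space.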

A study on removal of unnecessary input variables using multiple external association rule (다중외적연관성규칙을 이용한 불필요한 입력변수 제거에 관한 연구)

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.22 no.5 / pp.877-884 / 2011
  • The decision tree is a representative data mining algorithm used in many domains, such as retail target marketing, fraud detection, data reduction, variable screening, and category merging. The method is most useful in classification problems and for making predictions about a target group after dividing it into several small groups. When a decision tree model is built with a large number of input variables, the resulting tree is complex, which makes the model difficult to explore and analyze. In addition, associations between input variables often arise through external variables even when no intrinsic association exists. In this paper, we study a method for removing unnecessary input variables using multiple external association rules, and we apply the method to actual data to assess its efficiency.

Isolation and Structure Determination of Cytotoxic Substances from Juglans mandchurica (가래나무로부터 세포독성물질의 분리, 구조결정)

  • 조윤기;손종근;문동철;이인자
    • Proceedings of the Korean Society of Applied Pharmacology / 1994.04a / pp.248-248 / 1994
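  • Purpose: Juglans mandchurica is used in some folk remedies and in traditional Korean medicine to treat cancer. However, apart from fragmentary reports that juglone, an already known constituent, is cytotoxic, no studies on the isolation and structure determination of cytotoxic substances from this plant have been carried out. This study aims to provide primary data for the development of new anticancer agents by isolating cytotoxic substances from Juglans mandchurica growing wild in Korea and determining their structures. Methods: The methanol extract of Juglans mandchurica roots was fractionated into hexane, chloroform, ethyl acetate, n-butanol, and water fractions, and the cytotoxicity of each was measured. Substances showing cytotoxicity were isolated from the ethyl acetate fraction, and their structures are currently being determined by spectroscopic methods. Results: The substances isolated so far show weak cytotoxicity, and the structure of one of them has been determined.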

Measuring Pattern Recognition from Decision Tree and Geometric Data Analysis of Industrial CR Images (산업용 CR영상의 기하학적 데이터 분석과 의사결정나무에 의한 측정 패턴인식)

  • Hwang, Jung-Won;Hwang, Jae-Ho
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.5 / pp.56-62 / 2008
  • This paper proposes the use of decision tree classification for measuring pattern recognition in industrial Computed Radiography (CR) images used in nondestructive evaluation (NDE) of steel tubes. NDE problems lend themselves naturally to machine learning techniques that identify and classify patterns. The attributes of the decision tree are taken from the NDE test procedure. Geometric features, such as radiative angle, gradient, and distance, are estimated from analysis of the input image data. These factors make it easy to classify an input object accurately into one of the pre-specified classes of the decision tree. The algorithm simplifies the characterization of NDE results and facilitates the determination of features. The experimental results verify the usefulness of the proposed algorithm.

Development of S-QUEST and an Early Diagnosis System for Intrauterine Growth Restriction (IUGR) (S-QUEST와 태아발육제한증 (IUGR) 조기진단시스템 개발)

  • Cha, Gyeong-Jun;Park, Mun-Il;Choe, Hang-Seok;Sin, Yeong-Jae
    • Proceedings of the Korean Statistical Society Conference / 2003.05a / pp.171-176 / 2003
  • Data mining is the process of discovering, in vast amounts of data, the information needed for decision making. In this study, as an area of bioinformatics, we implemented QUEST, one of the decision tree algorithms that provide statistical decision-making systems for medicine, in S-PLUS (hereafter S-QUEST) and analyzed intrauterine growth restriction (IUGR) data.

Analysis of Personal Communication Service Cancellations Using a Decision Tree (의사결정나무를 이용한 개인휴대통신 해지자 분석)

  • 최종후;서두성
    • Proceedings of the Korean Operations and Management Science Society Conference / 1998.10a / pp.377-380 / 1998
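  • In this paper, we analyze subscribers who cancel personal communication service using decision tree analysis, which has recently been actively introduced as a data mining tool. We also use a logistic regression model to score each subscriber's likelihood of cancellation.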

Comparative Analysis of Predictors of Depression for Residents in a Metropolitan City using Logistic Regression and Decision Making Tree (로지스틱 회귀분석과 의사결정나무 분석을 이용한 일 대도시 주민의 우울 예측요인 비교 연구)

  • Kim, Soo-Jin;Kim, Bo-Young
    • The Journal of the Korea Contents Association / v.13 no.12 / pp.829-839 / 2013
  • This descriptive study predicts and compares the factors of depression affecting residents of a metropolitan city using logistic regression analysis and decision-making tree analysis. The subjects were 462 residents ($20{\leq}\text{age}<65$) of a metropolitan city. Data were collected between October 7, 2011 and October 21, 2011 and analyzed with frequency analysis, percentages, means and standard deviations, the ${\chi}^2$-test, the t-test, logistic regression analysis, the ROC curve, and a decision-making tree using SPSS 18.0. The common predictors of depression in community residents were social dysfunction, perceived physical symptoms, and family support. The specificity and sensitivity of the logistic regression were 93.8% and 42.5%, respectively. The receiver operating characteristic (ROC) curve was used to determine an optimal model; the AUC (area under the curve) was .84, and the ROC curve was statistically significant (p<.001). The specificity and sensitivity of the decision-making tree analysis were 98.3% and 20.8%, respectively. Overall classification accuracy was 82.0% for the logistic regression and 80.5% for the decision-making tree analysis. These results suggest that logistic regression analysis, which showed higher sensitivity and classification accuracy, may be useful for establishing a depression prediction model for community residents.
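
For readers comparing the figures above, the sketch below shows how sensitivity, specificity, and overall accuracy follow from a binary confusion matrix. The counts are hypothetical and are not the study's data; the function name is likewise an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: depressed subjects classified correctly/incorrectly.
    tn/fp: non-depressed subjects classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall share correct
    return sensitivity, specificity, accuracy


# Hypothetical counts chosen only to illustrate the arithmetic.
sens, spec, acc = classification_metrics(tp=30, fn=70, tn=290, fp=10)
```

A model can combine very high specificity with low sensitivity when the positive class is small, which is the pattern both models in the abstract show.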

The Transfer Technique among Decision Tree Models for Distributed Data Mining (분산형 데이터마이닝 구현을 위한 의사결정나무 모델 전송 기술)

  • Kim, Choong-Gon;Woo, Jung-Geun;Baik, Sung-Wook
    • Journal of Digital Contents Society / v.8 no.3 / pp.309-314 / 2007
  • A decision tree algorithm must be modified to suit distributed and collaborative environments for distributed data mining. The distributed data mining system proposed in this paper consists of several agents and a mediator. Each agent performs local data mining on the data at its local site and communicates with the other agents to build the global decision tree model. The mediator helps the agents communicate efficiently. One advantage of distributed data mining is that several agents can analyze huge data sets in much less time. The paper focuses on a technique for transferring local decision tree models among agents that reduces the large communication overhead.

Streaming Decision Tree for Continuity Data with Changed Pattern (패턴의 변화를 가지는 연속성 데이터를 위한 스트리밍 의사결정나무)

  • Yoon, Tae-Bok;Sim, Hak-Joon;Lee, Jee-Hyong;Choi, Young-Mee
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.1 / pp.94-100 / 2010
  • Data mining is mainly used for pattern extraction and information discovery from collected data. However, previous methods have difficulty reflecting patterns that change over time. In this paper, we introduce the Streaming Decision Tree (SDT), which analyzes large-scale continuous data with changing patterns. SDT divides the continuous data into blocks and extracts rules using a decision tree learning method. The extracted rules are combined by considering time of occurrence, frequency, and contradiction. In experiments, we applied SDT to time series data and confirmed reasonable results.