• Title/Abstract/Keywords: regression tree

689 search results (processing time 0.025 seconds)

An application to Zero-Inflated Poisson Regression Model

  • Kim, Kyung-Moo
    • Journal of the Korean Data and Information Science Society / Vol. 14, No. 1 / pp. 45-53 / 2003
  • Zero-inflated Poisson regression is a model for count data with excess zeros. When the response variable has excess zeros, it is not easy to apply the Poisson regression model. In this paper, we study and simulate the zero-inflated Poisson regression model. A real example is fitted with this model. Regression parameters are estimated by maximum likelihood (MLE). We also compare the fit of the zero-inflated Poisson model with that of the Poisson regression and decision tree models.
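
As a rough illustration of the model class named in this abstract, the sketch below writes out the standard zero-inflated Poisson probability mass function and the log-likelihood that maximum-likelihood estimation would maximize. It is not taken from the paper: the function names are hypothetical, and the rate `lam` and zero-inflation probability `pi` are held constant here, whereas a regression model would let them depend on covariates.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability P(Y = k) with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson probability P(Y = k).

    With probability pi the count is a structural zero; otherwise it is
    drawn from Poisson(lam). Both parameters are constants in this sketch.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def zip_log_likelihood(counts, lam, pi):
    """Log-likelihood of observed counts under the constant-parameter ZIP model."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)
```

Setting `pi = 0` recovers the plain Poisson model, which is one way to see why the two models can be compared on the same data.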

Estimation of fruit number of apple tree based on YOLOv5 and regression model

  • 곽희진;정윤주;전익조;이철희
    • 전기전자학회논문지 / Vol. 28, No. 2 / pp. 150-157 / 2024
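  • This paper proposes a new algorithm that predicts the number of apples on an apple tree using a deep-learning object detection model and a polynomial regression model. Counting the apples on a tree makes it possible to forecast apple yield and can also be used to assess losses when calculating crop disaster insurance payouts. To measure fruit set, the front and back of each apple tree were photographed. Apples in the photographs were identified and labeled to build a data set, which was then used to train a one-stage object detection CNN model. However, apples hidden by leaves, branches, and the like are not captured in the image, so an image-recognition deep learning model has difficulty recognizing or inferring them. To solve this problem, we propose a two-stage inference process. In the first stage, an image-based deep learning model counts the apples in the photographs taken from each side of the tree. In the second stage, polynomial regression is performed with the sum of the apple counts from the deep learning model as the independent variable and the number of apples counted by a person visiting the orchard as the dependent variable. In the performance evaluation of the proposed two-stage inference system, the average accuracy of counting the apples on each tree was 90.98%. The proposed method can therefore greatly reduce the time and cost of counting apples by hand. The method can also be widely used in related fields as a new base technology for deep-learning-based fruit-set prediction.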
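
The second stage described in this abstract (regressing the hand-counted number of apples on the summed detector counts) can be sketched as follows. This is only an illustration under assumptions: the abstract does not state the polynomial degree, so degree 2 is assumed, the function names are hypothetical, and the first stage (the object detector) is not shown, so its per-side counts are taken as given inputs.

```python
def fit_polynomial(xs, ys, degree=2):
    """Least-squares polynomial fit via the normal equations (pure Python).

    Returns coefficients c such that y is approximated by sum(c[i] * x**i).
    The default degree of 2 is an assumption, not a value from the paper.
    """
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


def predict_true_count(front_count, back_count, coeffs):
    """Stage 2: map the summed per-side detector counts to an estimated true count."""
    x = front_count + back_count
    return sum(c * x ** i for i, c in enumerate(coeffs))
```

In use, `fit_polynomial` would be called once on training trees (summed detector counts against hand counts), and `predict_true_count` would then be applied to new trees.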

Relation between the Shade Hours and the Landscape Tree Growth in the Apartment Housing Areas

  • 윤근영;안건용
    • 한국환경생태학회지 / Vol. 10, No. 1 / pp. 49-57 / 1996
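  • To identify the relationship between the growth of landscape trees and shade hours in apartment housing complexes, we surveyed the current size and planting location of four landscape tree species in Gwacheon Jugong Apartment Complex 2, analyzed the shade hours of each individual tree, and ran simple regression analyses of tree size on shade hours. Overall, the R$^{2}$ values were low, so the relationship between the two variables has weak explanatory power, and the derived regression equations were judged not to be generalizable. That is, because the correlation between landscape tree growth and shade hours was low within the study site, shade hours were estimated to be relatively less important to the growth of the landscape trees there than other environmental factors. However, eastern white pine generally showed a negative (-) correlation, while maple and Yulan magnolia showed a positive (+) correlation, which was taken to reflect the characteristics of shade-tolerant and sun-loving species. In addition, the statistically significant cases by species were root-collar diameter and crown width for maple, and crown width for Yulan magnolia, in relation to shade hours; these showed low correlation, with correlation coefficients below 0.4.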

Forecasting Sow's Productivity using the Machine Learning Models

  • 이민수;최영찬
    • 농촌지도와개발 / Vol. 16, No. 4 / pp. 939-965 / 2009
  • Machine learning has been identified as a promising approach to knowledge-based system development. This study examines how well machine learning techniques support farmers' decision making and develops a reference model for using pig farm data. We compared five machine learning techniques: logistic regression, decision tree, artificial neural network, k-nearest neighbor, and ensemble. All models predicted sow productivity well in all parities, with predictability above 87.6%. Model predictability for total litter size was highest, at 91.3%, in the third parity and decreased as parity increased. The ensemble predicted sow productivity well. The neural network and logistic regression were excellent classifiers for all parities, whereas the decision tree and k-nearest neighbor were not good classifiers for all parities. Performance varied across the models used, with differences of up to 104% in lift values. The artificial neural network and ensemble models produced the highest lift values, implying the best performance among the models.

"Pool-the-Maximum-Violators" Algorithm

  • Kikuo Yanagi;Akio Kudo;Park, Yong-Beom
    • Journal of the Korean Statistical Society / Vol. 21, No. 2 / pp. 201-207 / 1992
  • The algorithm for obtaining the isotonic regression in simple tree order, the most basic and simplest model next to the simple order, is considered. We propose to call it the "Pool-the-Maximum-Violators" algorithm (PMVA), in conjunction with the "Pool-Adjacent-Violators" algorithm (PAVA) in the simple order. The dual problem of obtaining the isotonic regression in simple tree order is our main concern. An intuitively appealing relation between the primal and the dual problems is demonstrated. The interesting difference is that in simple order the required number of poolings is at least the number of initial violating pairs and any path leads to the solution, whereas in the simple tree order it is at most the number of initial violators and there is only one advisable path, although there may be some others leading to the same solution.
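
For orientation, the sketch below implements the well-known Pool-Adjacent-Violators algorithm for the simple order, which the abstract uses as its point of comparison. It is not the paper's PMVA for the simple tree order, and the function name and block layout are illustrative choices.

```python
def pava(values, weights=None):
    """Isotonic (non-decreasing) regression under the simple order.

    Adjacent blocks whose means violate the ordering are pooled into
    their weighted mean until no violating pair remains.
    """
    if weights is None:
        weights = [1.0] * len(values)
    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool while the last two blocks are out of order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand the blocks back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted
```

For example, `pava([1, 3, 2, 4])` pools the violating pair (3, 2) into their mean and leaves the other points unchanged.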

IMPERVIOUS SURFACE ESTIMATION USING REMOTE SENSING IMAGES AND TREE REGRESSION

  • Kim, Soo-Young;Kim, Jong-Hong;Heo, Joon;Heo, Jun-Haeng
    • 대한원격탐사학회:학술대회논문집 / 대한원격탐사학회 2006년도 Proceedings of ISRS 2006 PORSEC Volume I / pp. 239-242 / 2006
  • Impervious surface is an important index for estimating urbanization and environmental change. In addition, impervious surface influences the parameters of rainfall-runoff models during the rainy season. An increase in impervious surface raises peak discharge and shortens the time of concentration in urban areas. Accordingly, impervious surface estimation is an important factor in developing and calibrating urban rainfall-runoff models. In this study, impervious surface is estimated using remote sensing images, such as Landsat-7 ETM+ and high-resolution satellite imagery, and a regression tree algorithm, with the Jungnang-cheon basin in Korea as the case study area.
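
The core step of a regression tree, choosing the split that minimizes the squared error of the two resulting leaves, can be sketched for a single feature as below. This is a generic illustration, not the study's implementation: the function names are hypothetical, and a full tree would repeat this search recursively over several features (for example, spectral band values predicting an impervious-surface fraction).

```python
def sse(ys):
    """Sum of squared errors of ys around their mean."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(xs, ys):
    """Find the threshold on one feature that minimizes the total SSE of two leaves.

    Returns (cost, threshold, left_mean, right_mean), or None if no split exists.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, sum(left) / len(left), sum(right) / len(right))
    return best
```

Each leaf predicts the mean of its training targets, which is why the split cost is measured as squared error around the leaf means.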

Development of Allometry and Individual Basal Area Growth Model for Major Species in Korea

  • 최정기
    • Journal of Forest and Environmental Science / Vol. 27, No. 1 / pp. 47-54 / 2011
  • Allometry and basal area equations were developed with various tree measurement variables for the major species in Korea: Quercus variabilis, Quercus mongolica, Pinus koraiensis, and Larix leptolepis. For the allometry models, the relationships between total height and DBH, crown width and DBH, height to the widest portion of the crown and total height, and height to base of crown and total height were investigated. Multiple regression methods were used to relate annual basal area growth to tree variables of initial size (DBH, total height, and crown width) and relative size (relative diameter and relative height), as well as competition measures (competition index, crown class, and live crown ratio).

Tree-Structure-Aware Genetic Operators in Genetic Programming

  • Seo, Kisung;Pang, Chulhyuk
    • Journal of Electrical Engineering and Technology / Vol. 9, No. 2 / pp. 749-754 / 2014
  • In this paper, we suggest tree-structure-aware GP (Genetic Programming) operators that heed tree distributions in structure space and their possible structural difficulties. The main idea of the proposed GP operators is to place the offspring generated by crossover and/or mutation in a specified region of tree structure space, insofar as possible, by biasing the tree structures of the altered subtrees, taking into account the observation that most solutions are found in that region. To demonstrate the effectiveness of the proposed approach, experiments are performed on the binomial-3 regression, multiplexor, and even parity problems. The results show that the proposed tree-structure-aware operators outperform standard GP on all three test problems in both success rate and number of evaluations.

Comparison of Data Mining Classification Algorithms for Categorical Feature Variables

  • 손소영;신형원
    • 산업공학 / Vol. 12, No. 4 / pp. 551-556 / 1999
  • In this paper, we compare the performance of three data mining classification algorithms (neural network, decision tree, logistic regression) in consideration of various characteristics of categorical input and output data. A $2^{4-1} \times 3$ fractional factorial design is used to simulate the comparison, where the factors are (1) the categorical ratio of input variables, (2) the complexity of the functional relationship between the output and input variables, (3) the size of randomness in the relationship, (4) the categorical ratio of an output variable, and (5) the classification algorithm. The experimental results indicate the following: decision tree performs better than the others when the relationship between output and input variables is simple, while logistic regression is better when the relationship is complex; and neural network appears to be a better choice than the others when the randomness in the relationship is relatively large. We also use a Taguchi design to improve the practicality of our results by treating the relationship between the output and input variables as a noise factor. As a result, the classification accuracy of neural network and decision tree turns out to be higher than that of logistic regression when the categorical proportion of the output variable is even.
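
One reading of the experimental layout in this abstract is a half fraction of four two-level factors crossed with the three classification algorithms, giving 24 runs. The sketch below enumerates such a layout. The defining relation D = ABC is an assumption (the abstract does not give the generator), and the function names and the coded levels -1/+1 are illustrative.

```python
from itertools import product


def half_fraction_2_4():
    """Return the 8 runs of a 2^(4-1) design with coded levels -1/+1.

    Factors A, B, C vary freely; D is set by the assumed generator D = A*B*C.
    """
    return [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]


def design_runs(algorithms=("neural network", "decision tree", "logistic regression")):
    """Cross the 8-run half fraction with the three-level algorithm factor."""
    return [levels + (alg,) for levels in half_fraction_2_4() for alg in algorithms]
```

With this generator the half fraction uses 8 of the 16 possible level combinations, so the full layout has 8 x 3 = 24 runs instead of 48.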

CORRELATION ANALYSIS BETWEEN FOREST VOLUME, ETM+ BANDS, AND HEIGHT ESTIMATED FROM C-BAND SRTM PRODUCT

  • Kim, Jin-Woo;Kim, Jong-Hong;Lee, Jung-Bin;Heo, Joon
    • 대한원격탐사학회:학술대회논문집 / 대한원격탐사학회 2006년도 Proceedings of ISRS 2006 PORSEC Volume I / pp. 512-515 / 2006
  • Forest stand height and volume are important indicators for management purposes as well as for environmental analysis. The Shuttle Radar Topography Mission (SRTM) signal is backscattered over the forest canopy, and a DSM can be acquired from this scattering characteristic, while the National Elevation Dataset (NED) provides bare-earth elevation data. The difference between SRTM and NED is taken as an estimate of tree height, and it is correlated with forest parameters, including average DBH, trees per acre, net BF per acre, and total net MBF. Among them, net board foot (BF) per acre is the index that represents forest volume well. The project site was a Douglas-fir-dominated plantation area in western Washington and northern Oregon in the U.S. This study shows a high correlation between the forest parameters and the products from SRTM, NED, and ETM+. Multiple regression analysis and a regression tree algorithm are applied to obtain an improved relationship among the parameters.
