• Title/Summary/Keyword: tree classification method


Detection of Forest Areas using Airborne LIDAR Data (항공 라이다데이터를 이용한 산림영역 탐지)

  • Hwang, Se-Ran;Kim, Seong-Joon;Lee, Im-Pyeong
    • Spatial Information Research / v.18 no.3 / pp.23-32 / 2010
  • LIDAR data are useful for forest applications such as bare-earth DEM generation in forest areas and the estimation of tree height and forest biomass. As a core preprocessing step for most forest applications, this study develops an efficient method to detect forest areas from LIDAR data. First, we propose three perceptual cues, based on multiple-return characteristics, height deviation, and spatial distribution, that are expected to be reliable for forest area detection from LIDAR data. We then classify potential forest areas using each individual cue and refine them with a bi-morphological process that eliminates falsely detected areas and smooths the boundaries. The final refined forest areas were compared with reference data generated manually from an aerial image. The methods based on all three types of cues achieve an accuracy of more than 90%. In particular, the method based on multiple returns is slightly better than the other two in terms of simplicity and accuracy. Combining the individual results from each cue is also shown to enhance the classification accuracy.
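
The multiple-return cue and the morphological refinement described in this abstract can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the grid layout, the `threshold` value, the 3x3 window, and the reading of "bi-morphological" as an opening followed by a closing are all assumptions made for the example.

```python
def multi_return_mask(multi, total, threshold=0.3):
    """Flag a grid cell as potential forest when the share of
    multi-return pulses exceeds `threshold` (hypothetical value)."""
    return [[1 if t > 0 and m / t > threshold else 0
             for m, t in zip(multi_row, total_row)]
            for multi_row, total_row in zip(multi, total)]

def _morph(mask, op):
    """Apply `op` (min = erosion, max = dilation) over a 3x3 window
    that is clipped at the grid border."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [mask[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = op(window)
    return out

def erode(mask):
    return _morph(mask, min)

def dilate(mask):
    return _morph(mask, max)

def refine(mask):
    """Opening removes small false detections; closing then fills
    small gaps and smooths the boundary (assumed interpretation)."""
    opened = dilate(erode(mask))
    return erode(dilate(opened))

# Toy 7x7 grid: a 3x3 forest patch plus one isolated false detection.
total = [[10] * 7 for _ in range(7)]
multi = [[1] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        multi[r][c] = 5
multi[0][6] = 5  # isolated cell, e.g. a building edge

forest = refine(multi_return_mask(multi, total))
```

In this toy grid the isolated cell is removed by the opening step while the 3x3 patch survives unchanged.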

Estimation of Carbon Absorption Distribution by Land Use Changes using RS/GIS Method in Green Land (RS/GIS를 이용한 토지이용변화에 의한 녹지의 이산화탄소 (CO2) 흡착량 분포 추정)

  • Na, Sang-Il;Park, Jong-Hwa;Park, Jin-Ki
    • Journal of The Korean Society of Agricultural Engineers / v.52 no.3 / pp.39-45 / 2010
  • Quantifying carbon absorption and understanding human-induced land use changes (LUC) are major topics of study with respect to global climate change. This study attempts to quantify the carbon absorption affected by LUC through remote sensing technology. Landsat imagery from four time periods was classified with the hybrid classification method in order to quantify carbon absorption by LUC. Thereafter, to estimate the amount of carbon absorption, the stand biomass of the forest was estimated as the total weight, that is, the sum of the individual tree weights. Individual tree volumes could be estimated from the crown width extracted from the digital forest cover type map. In particular, the carbon conversion index and the ratio of the $CO_2$ molecular weight to the C atomic weight, reported in the IPCC guideline, were used to convert the stand biomass into the amount of carbon absorption. Total carbon absorption was modeled using areal estimates of LUC for the four time periods together with carbon factors for land use type and standing biomass. The results suggest that, over a period of construction, 7.10% of forest and 9.43% of barren land were converted into urban land. In the conversion process, there was a loss of 6.66 t/ha/y (7.94%) of carbon absorption from the study area.
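
The biomass-to-CO2 conversion mentioned in this abstract can be illustrated with a short sketch. The ratio 44/12 is the molecular weight of CO2 over the atomic weight of C; the carbon conversion index of 0.5 and the tree weights are placeholder assumptions, since the abstract does not give the values actually used.

```python
CO2_TO_C_RATIO = 44.0 / 12.0  # molecular weight of CO2 / atomic weight of C

def stand_biomass(tree_weights_t):
    """Stand biomass as the sum of individual tree weights (tonnes)."""
    return sum(tree_weights_t)

def co2_absorption(biomass_t, carbon_fraction=0.5):
    """Convert biomass (t) to the equivalent CO2 (t).

    carbon_fraction is an assumed carbon conversion index; the value
    used in the paper is not stated in the abstract.
    """
    return biomass_t * carbon_fraction * CO2_TO_C_RATIO

# Hypothetical stand of four trees (weights in tonnes).
biomass = stand_biomass([2.0, 3.0, 3.5, 3.5])
co2 = co2_absorption(biomass)
```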

Development and application of prediction model of hyperlipidemia using SVM and meta-learning algorithm (SVM과 meta-learning algorithm을 이용한 고지혈증 유병 예측모형 개발과 활용)

  • Lee, Seulki;Shin, Taeksoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.111-124 / 2018
  • This study aims to develop a classification model for predicting the occurrence of hyperlipidemia, a chronic disease. Prior studies applying data mining techniques to disease prediction can be divided into studies that design models for predicting cardiovascular disease and studies that compare disease prediction results. In the foreign literature, studies predicting cardiovascular disease were predominant among those using data mining techniques. Domestic studies were not much different, although they mainly focused on hypertension and diabetes. Since hyperlipidemia is a chronic disease of similarly high importance to hypertension and diabetes, this study selected hyperlipidemia as the disease to be analyzed. We developed a model for predicting hyperlipidemia using SVM and meta-learning algorithms, which are known to have excellent predictive power. To this end, we used the 2012 data set of the Korea Health Panel. The Korea Health Panel produces basic data on the level of health expenditure, health level, and health behavior, and has conducted an annual survey since 2008. In this study, 1,088 patients with hyperlipidemia were randomly selected from the hospitalization, outpatient, emergency, and chronic disease data of the 2012 Korea Health Panel, and 1,088 non-patients were also randomly extracted, for a total of 2,176 subjects. Three methods were used to select input variables for predicting hyperlipidemia. First, a stepwise method was performed using logistic regression. Among the 17 variables, the categorical variables (except for length of smoking) were expressed as dummy variables, each treated as a separate variable relative to the reference group, and these variables were analyzed. Six variables (age, BMI, education level, marital status, smoking status, gender), excluding income level and smoking period, were selected at a significance level of 0.1. Second, C4.5, a decision tree algorithm, was used. The significant input variables were age, smoking status, and education level. Finally, a genetic algorithm was used. For SVM, the input variables selected by the genetic algorithm were six variables: age, marital status, education level, economic activity, smoking period, and physical activity status. For the artificial neural network, the genetic algorithm selected three variables: age, marital status, and education level. Based on the selected variables, we compared SVM, the meta-learning algorithms, and other prediction models for hyperlipidemia, and compared classification performance using TP rate and precision. The main results of the analysis are as follows. First, the accuracy of the SVM was 88.4% and that of the artificial neural network was 86.7%. Second, classification models using the input variables selected by the stepwise method were slightly more accurate than models using all variables. Third, the precision of the artificial neural network was higher than that of SVM when only the three input variables selected by the decision tree were used. With the input variables selected through the genetic algorithm, the classification accuracy of SVM was 88.5% and that of the artificial neural network was 87.9%. Finally, stacking, the meta-learning algorithm proposed in this study, performed best when it used the predicted outputs of SVM and MLP as input variables of an SVM meta-classifier. The purpose of this study was to predict hyperlipidemia, one of the representative chronic diseases. To do this, we used SVM and meta-learning algorithms, which are known to have high accuracy. As a result, the classification accuracy for hyperlipidemia with stacking as the meta-learner was higher than with other meta-learning algorithms. However, the predictive performance of the meta-learning algorithm proposed in this study is the same as that of SVM, the best-performing single model (88.6%). The limitations of this study are as follows. First, various variable selection methods were tried, but most variables used in the study were categorical dummy variables. With a large number of categorical variables, models well suited to categorical variables, such as decision trees, may fit better than general models such as neural networks, so the results may differ if continuous variables are used. Despite these limitations, this study is significant in predicting hyperlipidemia with hybrid models such as meta-learning algorithms, which have not been studied previously. The improvement in model accuracy obtained by applying various variable selection techniques is also meaningful. In addition, we expect that the proposed model will be effective for the prevention and management of hyperlipidemia.
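
The two evaluation measures named in this abstract, TP rate and precision, can be computed from predicted and true labels as in the sketch below. The label encoding (1 = hyperlipidemia, 0 = non-patient) and the sample vectors are assumptions made for illustration and are not data from the study.

```python
def tp_rate_and_precision(y_true, y_pred, positive=1):
    """Return (TP rate, precision) for the `positive` class.

    TP rate   = TP / (TP + FN)  (share of true patients that are found)
    precision = TP / (TP + FP)  (share of predicted patients that are real)
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return tp_rate, precision

# Hypothetical labels: 1 = hyperlipidemia, 0 = non-patient.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
tp_rate, precision = tp_rate_and_precision(y_true, y_pred)
```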

The Structure of Plant Community in Kwangnung Forest(II) - Analysis on the Forest Community in Mt. Jookyup by the Classification and Ordination Techniques - (광릉(光陵) 삼림(森林)의 식물군집구조(植物群集構造)(II) - Classification 및 Ordination방법에 의한 죽엽산지역(竹葉山地域)의 식생분석(植生分析) -)

  • Lee, Kyong Jae;Choi, Song Hyun;Jo, Jae Chang
    • Journal of Korean Society of Forest Science / v.81 no.3 / pp.214-223 / 1992
  • To investigate the structure of the plant community of the Mt. Jookyup area in Kwangnung forest, thirty-seven plots were set up by the clumped sampling method. Classification by TWINSPAN and two kinds of multivariate ordination (RA, DCA) were applied to the study area in order to classify the plots into several groups based on woody plants and environmental variables. The classification was successfully overlaid on an ordination of the same data using DCA. The plots can be classified into five groups by TWINSPAN and DCA. Both techniques suggest two successional trends of tree species in the canopy layer. The first is from Pinus densiflora to Carpinus laxiflora, and the second is from Pinus densiflora through Quercus mongolica to Carpinus laxiflora. In the understory layer, the expected trend is Rhododendron mucronulatum $\rightarrow$ Lindera obtusiloba, Symplocos chinensis for. pilosa, Viburnum erosum, Styrax obassia $\rightarrow$ Euonymus sachalinensis, Sorbus alnifolia. Analysis of the relationship between the DCA stand scores and environmental variables showed that soil pH, total nitrogen, available phosphate, and exchangeable potassium, sodium, calcium, and magnesium tended to increase significantly from the P. densiflora community to the Quercus spp. community.


The Structure of Plant Community in Kwangnung Forest(I) -Analysis on the Forest Community of Soribong Area by the Classification and Ordination Techniques- (광릉(光陵) 삼림(森林)의 식물군집구조(植物群集構造)(I) -Classification 및 Ordination 방법에 의한 소리봉(蘇利峯)지역의 식생분석(植生分析)-)

  • Lee, Kyong Jae;Jo, Jae Chang;Lee, Bong Su;Lee, Do Suck
    • Journal of Korean Society of Forest Science / v.79 no.2 / pp.173-186 / 1990
  • To investigate the structure of the plant community of the Soribong area in Kwangnung forest, forty-six plots were set up by the clumped sampling method. Classification by TWINSPAN and four kinds of multivariate ordination (PO, PCA, RA, DCA) were applied to the study area in order to classify the plots into several groups based on woody plants and environmental variables. The classification was successfully overlaid on an ordination of the same data using DCA. The plots can be classified into four groups by TWINSPAN and DCA. Both techniques suggest that the successional trend of tree species in the canopy layer is from Pinus densiflora through Quercus mongolica, Q. serrata, Q. aliena, Carpinus laxiflora, Sorbus alnifolia to C. cordata, Fraxinus rhynchophylla, Cornus controversa, and that the trend in the understory layer is from Rhododendron mucronulatum, Rhus trichocarpa, Lespedeza cyrtobotrya, Weigela subsessilis through Corylus sieboldiana, Lindera obtusiloba to Staphylea bumalda, Callicarpa japonica, Lonicera maackii. Analysis of the relationship between the DCA stand scores and environmental variables showed that soil pH and the amounts of humus, total nitrogen, and exchangeable cations tended to increase significantly from the P. densiflora community to the C. cordata community.


Evaluating the prediction models of leaf wetness duration for citrus orchards in Jeju, South Korea (제주 감귤 과수원에서의 이슬지속시간 예측 모델 평가)

  • Park, Jun Sang;Seo, Yun Am;Kim, Kyu Rang;Ha, Jong-Chul
    • Korean Journal of Agricultural and Forest Meteorology / v.20 no.3 / pp.262-276 / 2018
  • Models for predicting leaf wetness duration (LWD) were evaluated using meteorological and dew data observed at 11 citrus orchards in Jeju, South Korea, from 2016 to 2017. Sensitivity and prediction accuracy were evaluated for four models: Number of Hours of Relative Humidity (NHRH), Classification And Regression Tree/Stepwise Linear Discriminant (CART/SLD), Penman-Monteith (PM), and Deep-learning Neural Network (DNN). The sensitivity of the models was evaluated with respect to rainfall and seasonal changes. When data from rainy days were excluded from the whole data set, the LWD models had a smaller average error (Root Mean Square Error (RMSE) of about 1.5 hours). The seasonal error of the DNN model had a similar magnitude (RMSE of about 3 hours) in all seasons except winter. The other models had the greatest error in summer (RMSE of about 9.6 hours) and the lowest error in winter (RMSE of about 3.3 hours). The models were also evaluated by statistical error analysis and by regression analysis of the mean squared deviation. The DNN model performed best in terms of statistical error, whereas the CART/SLD model had the worst prediction accuracy. The Mean Square Deviation (MSD) analysis examines the linearity of a model through three components: squared bias (SB), nonunity slope (NU), and lack of correlation (LC). Better model performance was determined by lower SB and LC and higher NU. The MSD analysis indicated that the DNN model would provide the best performance, followed in order by the PM, NHRH, and CART/SLD models. This result suggests that machine learning models would be useful for improving the accuracy of agricultural information derived from meteorological data.
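
The three MSD components named in this abstract can be computed as in the following sketch. It assumes the commonly used definitions in which the slope comes from a least-squares regression of observed values on model values; the abstract does not spell out its formulas, and the LWD numbers below are made up for illustration.

```python
def msd_components(model, observed):
    """Split the mean squared deviation into SB + NU + LC.

    Assumed definitions (x = centred model values, y = centred
    observations, b = slope of y regressed on x, r = correlation):
      SB = (mean(model) - mean(observed))^2
      NU = (1 - b)^2 * sum(x^2) / n
      LC = (1 - r^2) * sum(y^2) / n
    The model values must not all be identical (sum(x^2) > 0).
    """
    n = len(model)
    mean_x = sum(model) / n
    mean_y = sum(observed) / n
    dx = [x - mean_x for x in model]
    dy = [y - mean_y for y in observed]
    sxx = sum(d * d for d in dx)
    syy = sum(d * d for d in dy)
    sxy = sum(a * b for a, b in zip(dx, dy))
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    msd = sum((x - y) ** 2 for x, y in zip(model, observed)) / n
    return {
        "MSD": msd,
        "SB": (mean_x - mean_y) ** 2,
        "NU": (1 - slope) ** 2 * sxx / n,
        "LC": (1 - r2) * syy / n,
    }

# Hypothetical daily LWD values in hours (predicted vs. observed).
parts = msd_components([2.0, 4.0, 6.0, 9.0], [3.0, 3.5, 7.0, 8.0])
```

With these definitions the three components add up to the MSD exactly, which makes it easy to see which kind of error dominates for a given model.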

A Study on the Feature Extraction Using Spectral Indices from WorldView-2 Satellite Image (WorldView-2 위성영상의 분광지수를 이용한 개체 추출 연구)

  • Hyejin, Kim;Yongil, Kim;Byungkil, Lee
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.33 no.5 / pp.363-371 / 2015
  • Feature extraction is one of the main goals in many remote sensing analyses. As high-resolution imagery became more available, it became possible to extract more detailed and specific features. A considerable number of image segmentation algorithms have therefore been developed, because traditional pixel-based analysis proved insufficient for high-resolution imagery due to its inability to handle the internal variability of complex scenes. However, a segmentation method on its own, which simply uses color layers, is limited in its ability to extract various target features with different spectral and shape characteristics. Spectral indices can support effective feature extraction by helping to identify abundant surface materials. This study aims to evaluate a feature extraction method based on a segmentation technique with spectral indices. We tested the extraction of diverse target features (such as buildings, vegetation, water, and shadows) from an eight-band WorldView-2 satellite image using decision tree classification, and used the result to derive the appropriate spectral indices for extracting each specific feature. From the results, we found that spectral band ratios can be applied to distinguish feature classes simply and effectively.
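
As a rough illustration of how a spectral band ratio can separate a feature class, the sketch below computes a normalized difference index and applies a single decision-tree-style threshold. NDVI is used only as a familiar example; the abstract does not say which indices or thresholds the study selected, so the function names, the 0.3 cut-off, and the reflectance values are all hypothetical.

```python
def normalized_difference(band_a, band_b):
    """Normalized difference index (a - b) / (a + b), 0.0 if undefined."""
    total = band_a + band_b
    return (band_a - band_b) / total if total != 0 else 0.0

def ndvi(nir, red):
    """NDVI: normalized difference of near-infrared and red reflectance."""
    return normalized_difference(nir, red)

def is_vegetation(nir, red, threshold=0.3):
    """One-node decision rule; the threshold is a hypothetical value."""
    return ndvi(nir, red) > threshold

# Hypothetical reflectances for a vegetated and a paved pixel.
veg_index = ndvi(0.5, 0.1)
road_index = ndvi(0.2, 0.18)
```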

Ensemble of Nested Dichotomies for Activity Recognition Using Accelerometer Data on Smartphone (Ensemble of Nested Dichotomies 기법을 이용한 스마트폰 가속도 센서 데이터 기반의 동작 인지)

  • Ha, Eu Tteum;Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems / v.19 no.4 / pp.123-132 / 2013
  • As smartphones are equipped with various sensors such as the accelerometer, GPS, gravity sensor, gyroscope, ambient light sensor, and proximity sensor, there has been much research on using these sensors to create valuable applications. Human activity recognition is one such application, motivated by various welfare applications such as support for the elderly, measurement of calorie consumption, and analysis of lifestyles and exercise patterns. One of the challenges in using smartphone sensors for activity recognition is that the number of sensors used should be minimized to save battery power. When the number of sensors is restricted, it is difficult to build a highly accurate activity recognizer or classifier, because it is hard to distinguish between subtly different activities with only limited information. The difficulty becomes especially severe when the number of activity classes to be distinguished is very large. In this paper, we show that a fairly accurate classifier that distinguishes ten different activities can be built using data from only a single sensor, the smartphone accelerometer. Our approach to this ten-class problem is the ensemble of nested dichotomies (END) method, which transforms a multi-class problem into multiple two-class problems. END builds a committee of binary classifiers in a nested fashion using a binary tree. At the root of the binary tree, the set of all classes is split into two subsets by a binary classifier. At a child node of the tree, a subset of classes is again split into two smaller subsets by another binary classifier. Continuing in this way, we obtain a binary tree in which each leaf node contains a single class. This binary tree can be viewed as a nested dichotomy that can make multi-class predictions. Depending on how the set of classes is split into two subsets at each node, the final tree can differ. Since some classes may be correlated, a particular tree may perform better than the others. However, the best tree can hardly be identified without deep domain knowledge. The END method copes with this problem by building multiple dichotomy trees randomly during learning and then combining the predictions made by each tree during classification. The END method is generally known to perform well even when the base learner is unable to model complex decision boundaries. As the base classifier at each node of the dichotomy, we use another ensemble classifier, the random forest. A random forest is built by repeatedly generating decision trees, each from a bootstrap sample with a different random subset of features. By combining bagging with random feature subset selection, a random forest has more diverse ensemble members than simple bagging. Overall, our ensemble of nested dichotomies can be seen as a committee of committees of decision trees that can deal with a multi-class problem with high accuracy. The ten classes of activities that we distinguish in this paper are 'Sitting', 'Standing', 'Walking', 'Running', 'Walking Uphill', 'Walking Downhill', 'Running Uphill', 'Running Downhill', 'Falling', and 'Hobbling'. The features used for classifying these activities include not only the magnitude of the acceleration vector at each time point but also the maximum, minimum, and standard deviation of the vector magnitude within a time window of the last 2 seconds, among others. For experiments comparing the performance of END with that of other methods, accelerometer data were collected every 0.1 second for 2 minutes for each activity from 5 volunteers. Of the 5,900 ($= 5 \times (60 \times 2 - 2) / 0.1$) samples collected for each activity (the data for the first 2 seconds are discarded because they lack time window data), 4,700 were used for training and the rest for testing. Although 'Walking Uphill' is often confused with some other similar activities, END classified all ten activities with a fairly high accuracy of 98.4%. By comparison, a decision tree, a k-nearest neighbor classifier, and a one-versus-rest support vector machine achieved accuracies of 97.6%, 96.5%, and 97.6%, respectively.
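
The tree structure behind END can be sketched as below: each random nested dichotomy recursively splits the class set in two until every leaf holds a single class, and several such trees form the ensemble. The sketch only builds the class hierarchy and omits the random forest trained at each internal node; the function names, the splitting rule, and the ensemble size of 5 are assumptions made for illustration.

```python
import random

ACTIVITIES = ["Sitting", "Standing", "Walking", "Running",
              "Walking Uphill", "Walking Downhill", "Running Uphill",
              "Running Downhill", "Falling", "Hobbling"]

def build_dichotomy(classes, rng):
    """Randomly split `classes` into two non-empty subsets, recursively.

    A leaf is a single class label; an internal node is a (left, right)
    tuple and would hold one binary classifier in a full implementation.
    """
    classes = list(classes)
    if len(classes) == 1:
        return classes[0]
    rng.shuffle(classes)
    cut = rng.randint(1, len(classes) - 1)
    return (build_dichotomy(classes[:cut], rng),
            build_dichotomy(classes[cut:], rng))

def leaves(tree):
    """Class labels at the leaves of a dichotomy tree."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]

def count_binary_classifiers(tree):
    """Internal nodes, i.e. binary classifiers needed for this tree."""
    if isinstance(tree, tuple):
        return (1 + count_binary_classifiers(tree[0])
                + count_binary_classifiers(tree[1]))
    return 0

# An ensemble of 5 random dichotomies (ensemble size is hypothetical).
ensemble = [build_dichotomy(ACTIVITIES, random.Random(seed))
            for seed in range(5)]

# Samples per activity: 5 volunteers, 120 s minus the first 2 s,
# sampled every 0.1 s (10 samples per second).
samples_per_activity = 5 * (60 * 2 - 2) * 10
```

Whatever the random splits, a tree over ten classes always needs nine binary classifiers, so the cost of END grows with the number of trees in the ensemble rather than with the shape of any single tree.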

Classification of Micro-Landform on the Alluvial Plain Using Landsat TM Image: The Case of the Kum-ho River Basin Area (Landsat TM 영상(映像)을 이용한 충적평야(沖積平野) 미지형(微地形) 분류(分類) -금호강(琴湖江) 유역평야(流域平野)를 대상으로-)

  • Jo, Myung-Hee;Jo, Wha-Ryong
    • Journal of the Korean association of regional geographers / v.2 no.2 / pp.197-204 / 1996
  • We propose a method for classifying micro-landforms on the alluvial plain, such as natural levees, back marshes, and alluvial fans, using a false color composite of Landsat Thematic Mapper imagery. The study area is the Kumho River Basin in the southeastern part of the Korean peninsula. The most effective image for micro-landform classification is the false color composite of bands 2, 3, and 4 with blue, green, and red filtering. The most favorable time is the middle third of November, because the difference in green vegetation density is greatest then. At this time the paddy fields on the back marsh are bare after the rice harvest, whereas on the natural levee relatively more green vegetation remains, such as vegetables and low herbs under fruit trees. On the alluvial fan, the green vegetation condition is intermediate. To verify the micro-landform classification, we carried out a field survey and grain size analysis of the deposits of each micro-landform in the sample area. The results show that the classification method of micro-landforms on the alluvial plain using Landsat TM imagery is relatively useful.


Group Classification on Management Behavior of Diabetic Mellitus (당뇨 환자의 관리행태에 대한 군집 분류)

  • Kang, Sung-Hong;Choi, Soon-Ho
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.2 / pp.765-774 / 2011
  • The purpose of this study is to provide informative statistics that can be used for effective diabetes management programs. We collected and analyzed data on 666 people with diabetes who participated in the Korean National Health and Nutrition Examination Survey in 2007 and 2008. Group classification of diabetes management behavior is based on the K-means clustering method. The decision tree method and multiple regression analysis were used to study the factors affecting diabetes management behavior. People with diabetes were broadly classified into three groups: the Health Behavior Program Group, the Focused Management Program Group, and the Complication Test Program Group. First, the Health Behavior Program Group consists of people who follow drug therapy and complication testing well but still need to improve their health behavior, for example by exercising regularly and avoiding drinking and smoking. Second, the Focused Management Program Group consists of people who are uncooperative about treatment and complication testing and are also passive about improving their health behavior. Third, the Complication Test Program Group consists of people who have a positive attitude toward treatment and improving their health behavior but pay no attention to the complication tests that detect acute and chronic disease early. The main factor for group classification was whether the person had hyperlipidemia. Group membership also varied widely with gender, income, age, occupation, and self-rated health. To improve the rate of diabetes management, specialized diabetes management programs should be applied according to each group's characteristics.