• Title/Summary/Keyword: canonical forest

GIR-based canonical forest: An ensemble method for imbalanced big data (불균형 데이터의 분류 성능 향상을 위한 일반화된 불균형 비율(GIR) 기반의 과소 표집 canonical forest (GC-Forest))

  • Solji Han;Jaesung Myung;Hyunjoong Kim
    • The Korean Journal of Applied Statistics / v.37 no.5 / pp.615-629 / 2024
  • In big data mining, imbalanced classification has been actively researched for decades. Imbalanced data issues take various forms, but past research mainly focused on sample size imbalance between classes. Recent studies, however, have shown that classification performance degrades far more when sample size imbalance is combined with class overlap than under sample size imbalance alone. In response, this study introduces GC-Forest (GIR-based canonical forest), an ensemble classification method that uses a weighted resampling technique accounting for the degree of overlap between classes. The method measures the imbalance ratio in terms of class overlap at each stage of the ensemble and balances the classes by increasing the representativeness of the minority class. To improve overall classification performance, GC-Forest adopts the canonical forest as its ensemble classifier, which is designed to enhance both the performance and the diversity of individual classifiers. The proposed method was evaluated in experiments on 14 real imbalanced datasets. GC-Forest showed very competitive classification performance in terms of AUC, PR-AUC, G-mean, and F1-score compared with 7 other ensemble methods.
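
Two of the metrics named in this abstract, G-mean and F1-score, can be illustrated with a minimal sketch. This is not code from the paper: the function names, the binary 0/1 labelling, and the toy data are assumptions made for illustration. It shows why these metrics are preferred over plain accuracy when the minority class is small.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical toy data: 2 minority (1) and 8 majority (0) samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, g_mean(y_true, y_pred), f1_score(y_true, y_pred))
```

On this toy data the classifier is right on 8 of 10 samples, yet it recovers only half of the minority class, which G-mean and F1 both penalise.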

Relationships between Soil-Site Properties and Bamboo (Phyllostachys bambusoides) Growth (토양(土壤)의 이화학적(理化學的) 특성(特性)과 대나무 생장(生長)과의 관계(關係))

  • Chung, Young Gwan;Ramm, Carl W.
    • Journal of Korean Society of Forest Science / v.79 no.1 / pp.16-20 / 1990
  • Canonical correlation analysis was used to relate 17 soil-site variables to bamboo diameter, height, and internodal characteristics. The first canonical correlation was highly significant and explained much of the variance in both sets of variables, and the canonical variates made sense biologically. Surface soil depth, total nitrogen and percent organic matter had high positive correlations with the first soil-site canonical variate. Clay content (%) and cation exchange capacity were negatively correlated with the first soil-site canonical variate. Only 8 of the predictor variables were considered relevant for predicting bamboo growth.

Canonical Correspondence Analysis(CCA) on the Forest Vegetation of Mt. Togyu National Park, Korea (Canonical Correspondence Analysis(CCA)에 의한 덕유산 국립공원의 삼림식생분석)

  • Kim, Chang-Hwan;Kil, Bong-Seop
    • The Korean Journal of Ecology / v.20 no.2 / pp.125-132 / 1997
  • Forest vegetation in Mt. Tŏkyu National Park was investigated using ordination techniques. By the TWINSPAN (Two-Way Indicator Species Analysis) method, 10 groups were recognized as follows: Pinus densiflora, Quercus variabilis, Quercus serrata, Quercus mongolica-Rhododendron schlippenbachii, Quercus mongolica-Abies koreana, Quercus mongolica-Acer pseudo-sieboldianum, Quercus mongolica-Symplocos chinensis for. pilosa, Carpinus laxiflora, Fraxinus mandshurica and Taxus cuspidata groups. Among the environmental factors examined, the floristic composition of these groups was highly correlated with soil moisture (r=0.831), altitude (r=0.784), topography (r=-0.722), organic matter (r=0.642), and pH (r=-0.509). According to the results of CCA (Canonical Correspondence Analysis), the Pinus densiflora group and the Quercus variabilis group were situated in a xeric area at a lower altitude where soil nutrients were poor compared with the other groups. The Fraxinus mandshurica group was distributed throughout the valley with high soil moisture and good nutrients; the Quercus serrata group and the Carpinus laxiflora group were found in the low-altitude region with good nutrients; the Quercus mongolica group in the high-altitude region with good nutrients; and Quercus mongolica-Acer koreana and Taxus cuspidata at higher altitudes (1,400-1,600 m).

The Relationship between Characteristics of Forest Fires and Spatial Patterns of Forest Types by the Ecoregions of South Korea (한국의 생태지역별 산불특성과 임상분포패턴과의 관계)

  • Lee, Byungdoo;Song, Jungeun;Lee, Myungbo;Chung, Joosang
    • Journal of Korean Society of Forest Science / v.97 no.1 / pp.1-9 / 2008
  • Efficient management of fire and forests requires examining the relationship between the spatial patterns of forest types and the characteristics of forest fires. For each ecoregion of South Korea, we computed landscape indices for all forest types (landscape level) and for pine forests (class level), and analyzed the characteristics of forest fires using forest fire statistics from 1991 to 2006. We performed canonical correlation analysis to model the relationship between the landscape indices and the forest fire statistics. At the landscape level, forest patches were larger and more complex in the ecoregions with a higher percentage of forest area. At the class level, pine forest patches were more complex and closer to neighboring patches in the coastal ecoregions. The ecoregions including metropolitan areas and cities had more frequent fire occurrences per 1,000 ha, while mountainous coastal ecoregions had larger burned areas and faster fire spread rates. The canonical correlation between the landscape indices for pine forests and the forest fire statistics was statistically significant at the 0.05 level and explained more than 70% of the variation in the fire variables. The results showed that combustion time per fire was longer in the ecoregions with larger and more aggregated pine forest patches.

Environmental Factors Affecting the Abundance and Presence of Tree Species in a Tropical Lowland Limestone and Non-limestone Forest in Ben En National Park, Vietnam

  • Nguyen, Thinh Van;Mitlohner, Ralph;Bich, Nguyen Van;Do, Tran Van
    • Journal of Forest and Environmental Science / v.31 no.3 / pp.177-191 / 2015
  • The effect of environmental variables on the presence and abundance of tree species in a tropical lowland undisturbed limestone and non-limestone forest in Ben En National Park, Vietnam was investigated. The relationships between 13 environmental variables and 29 tree species with a DBH $\geq 10$ cm, as well as between six physical variables and 26 species of seedling and sapling communities, were assessed by canonical correspondence analysis (CCA). Data on all tree species with DBH $\geq 10$ cm were collected from eighteen 400 m$^2$ sample plots, while the abundance of regeneration (all individuals with DBH $\leq 5$ cm) was counted in fifty $2 \times 20$ m strip-plots. The significance of species-environment correlations was tested by distribution-free Monte Carlo tests. The CCA of the 29 examined tree species and 13 environmental variables indicated that the presence and abundance of the tree species were closely related to topographic factors. We may confirm that soil properties, including pH, soil moisture content, and soil texture, were the most crucial factors in tree species composition and distribution. Several species, including Pometia pinnata, Amesiodendron chinense, Gironniera cuspidata, Cinnamomum mairei, and Caryodaphnopsis tonkinensis, were not controlled by soil properties and topographic variables. The CCA also indicated that the abundance of regenerating tree species at all sites had positive and significant correlations with soil depth, while the occurrence of several other tree species (such as Koilodepas longifolium and Aglaia dasyclada) was positively correlated with steeper slopes and rocky outcrops.

Disturbance, Diversity, Regeneration and Composition in Temperate Forests of Western Himalaya, India

  • Tiwari, Om Prakash;Sharma, Chandra Mohan;Rana, Yashwant Singh;Krishan, Ram
    • Journal of Forest and Environmental Science / v.35 no.1 / pp.6-24 / 2019
  • We investigated the impact of anthropogenic and natural disturbances on regeneration, composition and diversity in some temperate forests of the Bhagirathi Catchment Area of Garhwal Himalaya. The forests were categorized on the basis of canopy cover and magnitude of disturbance into highly, moderately and least disturbed classes. The dominant tree species at lower elevations were Pinus roxburghii and Quercus leucotrichophora, while Abies pindrow, Q. semecarpifolia and Rhododendron arboreum were the dominant species in the upper-elevation forests. Cyathula tomentosa and Indigofera heterantha were the dominant shrub species present in all the forests. Similarly, Cirsium wallichii and Oxalis corniculata were the dominant herb species found in all forests (except the Q. leucotrichophora forest), whereas Thalictrum foliolosum and Viola pilosa were noted in each forest (except the P. roxburghii forest). Tree density ranged from $400 \pm 10$ trees ha$^{-1}$ to $750 \pm 89.1$ trees ha$^{-1}$ and generally decreased from lower to higher disturbance regimes; however, the total basal cover was highest ($88.1 \pm 23.6$ m$^2$ ha$^{-1}$) in the highly disturbed forest and lowest ($25.8 \pm 2.2$ m$^2$ ha$^{-1}$) in the moderately disturbed forest. Shrub and herb densities were highest in the least disturbed forest, while young regenerating individuals, i.e., saplings and seedlings, increased from highly to least disturbed forests, indicating that forest fragmentation adversely affected regeneration. However, A. pindrow and P. roxburghii were found invariably encroaching on the habitats of R. arboreum and Q. leucotrichophora, respectively, at various altitudes. The Canonical Correspondence Analysis clearly indicated that elevation and lopping intensity had more impact on trees, while shrubs and herbs were more influenced by elevation, canopy cover, light attenuation and soil erosion. Pinus roxburghii was the only species affected by heavy litter removal and forest fire.

Detrended canonical correspondence analysis and polar ordination analysis on the forest communities of mudungsan. (DCCA 와 Polar Ordination 에 依한 無等山의 森林 群落 分析)

  • Kim, Chang-Hwan;Kil, Bong-Seop
    • The Korean Journal of Ecology / v.15 no.2 / pp.117-125 / 1992
  • TWINSPAN (two-way indicator species analysis), DCCA (detrended canonical correspondence analysis) and the polar ordination method were used to analyze the relation between forest vegetation and habitat on Mudungsan (1,187 m), located in the Kwangju area. The vegetation survey used a 1:25,000 topographical map and 41 quadrats, and was carried out from April 1990 to August 1991. The forest vegetation of Mudungsan was classified by the TWINSPAN method into the Quercus acutissima community, Fraxinus mandshurica community, Quercus mongolica community, Quercus serrata community, Quercus dentata community, Quercus variabilis community, and Pinus densiflora community, and this almost coincided with the result of polar ordination. According to the DCCA analysis, the P. densiflora community was formed in xeric, low-altitude regions where soil nutrients were poor compared with other communities. The Q. variabilis and Q. acutissima communities were distributed in low-altitude regions where organic matter content was comparatively low, but the Q. acutissima community was formed in damp regions while the Q. variabilis community was formed in xeric regions. Q. mongolica and F. mandshurica formed communities in high-altitude regions; in particular, the F. mandshurica community was distributed in high-humidity regions. According to the polar ordination analysis, the forest vegetation was classified into 7 communities along environmental gradients such as humidity, organic matter, pH, temperature, C.E.C. and P2O5.

An Analysis of Vegetation-Environment Relationships of Mt. Gyeryong and Mt. Deokyu by Detrended Canonical Correspondence Analysis (DCCA에 의(依)한 계룡산(鷄龍山)과 덕유산(德裕山)의 삼림군집(森林群集)과 환경(環境)의 상관관계(相關關係) 분석(分析))

  • Song, Ho-Kyung
    • Journal of Korean Society of Forest Science / v.79 no.2 / pp.216-221 / 1990
  • Vegetation data from Mt. Gyeryong and Mt. Deokyu in central Korea were analysed in relation to 15 environmental variables. Two multivariate methods were applied: two-way indicator species analysis (TWINSPAN) for classification, and detrended canonical correspondence analysis (DCCA), a recent technique that extracts ordination axes that can be related to environmental factors. The relationship between the distribution of the dominant species of forest vegetation and soil conditions on Mt. Gyeryong and Mt. Deokyu was investigated by analyzing elevation and soil nutrition gradients. Quercus mongolica forest was distributed in the high-elevation, good-nutrition area; Carpinus laxiflora and Fraxinus rhynchophylla forests in the medium-elevation, good-nutrition area; Pinus densiflora-Quercus mongolica and Quercus variabilis forests in the medium-elevation, medium-nutrition area; Styrax japonica forest in the low-elevation, medium-nutrition area; and Pinus densiflora forest in the low-elevation, poor-nutrition area. The dominant compositional gradient was related to elevation.

Ensemble model through mixed projections useful for big data analytics (투영 조합을 통한 빅데이터 앙상블 모형)

  • Hyejoon Park;Hyunjoong Kim;Yung-Seop Lee
    • The Korean Journal of Applied Statistics / v.37 no.5 / pp.691-702 / 2024
  • In this paper, we propose the mixed projection forest (MPF), a new classification ensemble method that can be effectively applied in big data analysis. When training individual classifiers within an ensemble, MPF uses oblique hyperplanes obtained from a combined rotation matrix derived from two data projection techniques, principal component analysis (PCA) and canonical linear discriminant analysis (CLDA), thereby improving the accuracy of each classifier. Additionally, the diversity of individual classifiers is improved by generating various rotation matrices through random partitioning of the input variable set. This approach ultimately enhances classification performance and proves to be highly effective in big data analysis that demands precision. We compared the performance of MPF with existing classification ensemble models using 30 real or simulated datasets. The results indicate that MPF achieves competitive performance in terms of classification accuracy and classifier diversity.

Correlation Analysis between Forest Vegetation Type and Environment Factor in Mt. Hwaak (화악산의 산림군락과 환경요인의 상관관계 분석)

  • Yun, Chung-Weon;Kim, Hye-Jin;Yang, Hee-Moon;Lim, Jong-Hwan;Kim, Young-Kul;Shin, Joon-Hwan;Lee, Byeng-Cheon
    • Journal of Environmental Science International / v.18 no.5 / pp.579-588 / 2009
  • The purpose of this study was to explain the relationship between community structure and environmental variables on Mt. Hwaak. Samples were collected from 101 plots using the ZM phytosociological method and analyzed by cluster analysis, importance value analysis and canonical correspondence analysis. The forest vegetation was classified into 8 community types: Pinus densiflora community, Berberis amurensis community, Betula ermani community, Betula schmidtii community, Larix leptolepis community, Pinus koraiensis community, Cornus controversa community and Salix koreensis community. Altitude was the factor most strongly correlated with the community types. The Berberis amurensis and Betula ermani communities were located in upper-slope areas at high elevation, the Cornus controversa and Salix koreensis communities in valley areas, and the Pinus densiflora community in ridge areas.