• Title/Abstract/Keywords: methods of data analysis

Current Status of Tree Height Estimation from Airborne LiDAR Data

  • Hwang, Se-Ran;Lee, Im-Pyeong
    • Korean Journal of Remote Sensing / Vol. 27, No. 3 / pp.389-401 / 2011
  • Most nations have expressed significant concern about climate change caused by the rapid increase in greenhouse gases and have reached an international agreement to control the total amount of these gases in order to mitigate global warming. As the most important absorber of carbon dioxide, one of the major greenhouse gases, forest resources should be managed more tightly, with a means to measure their total amount, forest biomass, efficiently and accurately. Forest biomass is closely related to forest area and tree height. Airborne LiDAR data help extract biophysical properties of forest resources, such as tree height, more efficiently by providing detailed spatial information about the ground surface over wide areas. Many researchers have thus developed various methods to estimate tree height from LiDAR data, and these methods differ in performance and characteristics depending on the forest environment and the data. In this study, we survey these techniques for estimating tree height, describe their advantages and limitations, and suggest future research directions. We first examine the characteristics of LiDAR data applied to forest studies and then analyze filtering methods, a prerequisite step for tree height estimation. We classify the tree height estimation methods into two categories, individual tree-based and regression-based methods, and describe the representative methods in each category with a summary of their analysis results. Finally, we review techniques for data fusion between LiDAR and other remote sensing data for future work.

Model-Data Based Small Area Estimation

  • Shin, Key-Il;Lee, Sang Eun
    • Communications for Statistical Applications and Methods / Vol. 10, No. 3 / pp.637-645 / 2003
  • Small area estimation has been studied using data-based methods such as direct, indirect, and synthetic methods. Recently, however, model-based methods, such as those based on regression or time series estimation, have been applied. In this paper we investigate a model-data based small area estimation that takes into account the spatial relation among the areas. Data from the 2001 Economically Active Population Survey are used for the analysis, and the results of the model-based and model-data based estimation are compared using MSE (mean squared error), MAE (mean absolute error), and MB (mean bias).
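
The three comparison measures named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `error_measures` and the sample numbers are hypothetical, and it assumes the common definitions in which the error is the estimate minus the reference value.

```python
from statistics import mean


def error_measures(estimates, references):
    """Return (MSE, MAE, MB) of estimates against reference values.

    Assumption: error = estimate - reference, so a negative MB means
    the estimates are too low on average.
    """
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [e - r for e, r in zip(estimates, references)]
    mse = mean([err ** 2 for err in errors])   # mean squared error
    mae = mean([abs(err) for err in errors])   # mean absolute error
    mb = mean(errors)                          # mean bias (signed)
    return mse, mae, mb


# Hypothetical small-area estimates and reference values.
mse, mae, mb = error_measures([2.0, 4.0, 6.0], [1.0, 4.0, 8.0])
```

MSE penalizes large errors more heavily than MAE, while MB keeps the sign and so reveals systematic over- or under-estimation that the other two hide.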

A Study on K-Means Clustering

  • Bae, Wha-Soo;Roh, Se-Won
    • Communications for Statistical Applications and Methods / Vol. 12, No. 2 / pp.497-508 / 2005
  • This paper studies K-means clustering, focusing on initialization, which affects the clustering results in K-means cluster analysis. Four methods (the MA method, the KA method, the Max-Min method, and the Space Partition method) were compared. The clustering results show some differences among these methods: in particular, the MA method sometimes leads to incorrect clustering because of inappropriate initialization, depending on the type of data, and the Max-Min method is shown to be more effective than the other methods, especially when the data size is large.
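
The abstract does not spell out the Max-Min method, so the sketch below shows one common reading of a max-min (farthest-point) initialization: start from one point, then repeatedly add the point whose distance to its nearest already-chosen center is largest. The function name `max_min_init`, the choice of the first center, and the sample points are assumptions for illustration only and may differ from the paper's procedure.

```python
import math


def max_min_init(points, k, first_index=0):
    """Pick k initial centers by a farthest-point (max-min) rule.

    Assumption: the first center is points[first_index]; the paper may
    choose it differently.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centers = [points[first_index]]
    # min_dist[i] = distance from point i to its nearest chosen center
    min_dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        # Choose the point that maximizes the minimum distance to the centers.
        idx = max(range(len(points)), key=min_dist.__getitem__)
        centers.append(points[idx])
        # Refresh nearest-center distances with the newly added center.
        min_dist = [min(d, math.dist(p, points[idx]))
                    for d, p in zip(min_dist, points)]
    return centers


# Hypothetical 2-D data with three loose groups.
points = [(0, 0), (1, 0), (10, 0), (10, 1), (0, 9)]
centers = max_min_init(points, 3)
```

Because each new center is far from all previous ones, the initial centers tend to spread across the data, which is one plausible reason such a rule behaves well on larger data sets.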

Cluster Analysis of Car Parking Data, and Development of their Web Applications

  • Kubota, Takafumi;Hayashi, Takayuki;Tarumi, Tomoyuki
    • Communications for Statistical Applications and Methods / Vol. 18, No. 4 / pp.549-557 / 2011
  • In this paper, we apply cluster analysis to the "Okayama parking data," a spatial point pattern data set that includes the locations and fare structures of car parking spaces in the central area of Okayama. This study classifies the characteristics of small areas using the Okayama parking data and visualizes the results of the cluster analysis. We develop web applications that connect the results of the cluster analysis and overlay objects, including balloon points and small-area rectangles, on a map of the central area of Okayama.

Study of Mental Disorder Schizophrenia, based on Big Data

  • Hye-Sun Lee
    • International Journal of Advanced Culture Technology / Vol. 11, No. 4 / pp.279-285 / 2023
  • This study provides academic implications by examining trends in domestic research on schizophrenia and psychosocial therapy. For the analysis, 65 papers were collected, and text mining with the R program and social network analysis were used. The results are as follows. First, the collected data were visualized through keyword analysis using the word cloud method. Second, keyword frequency analysis showed that keywords such as intervention, schizophrenia, research, patients, program, effect, society, mind, ability, and function were recorded with the highest frequency. Third, LDA (latent Dirichlet allocation) topic modeling yielded 3 topics: patient subjects, psychosocial intervention, and efficacy of interventions. Fourth, the social network analysis derived connectivity, closeness centrality, and betweenness centrality. In conclusion, this study is significant in that it provides basic data on rehabilitation and psychosocial therapy for schizophrenia by applying big data methods, namely text mining and social network analysis, to identify research trends and present the results through visualization.

Improving Satellite Derived Soil Moisture Data Using Data Assimilation Methods

  • Hwang, Soonho;Ryu, Jeong Hoon;Kang, Moon Seong
    • Korea Water Resources Association: Conference Proceedings / Korea Water Resources Association 2018 Conference / pp.152-152 / 2018
  • Soil moisture is an important factor in hydrologic analysis, so spatially distributed soil moisture data can support research in various fields. Many satellite-derived soil moisture data sets are now freely available on the web. In particular, NASA (National Aeronautics and Space Administration) launched the Soil Moisture Active Passive (SMAP) satellite on 31 January 2015 to map global soil moisture. SMAP data have many advantages for research; for example, they have higher spatial resolution than other satellite-derived data. However, because many satellite-derived soil moisture data sets are limited in accuracy, ancillary data can be used to improve them. In this study, we therefore applied a data assimilation algorithm and analyzed the applicability of satellite-derived soil moisture data. Among the various data assimilation methods, the Model Output Statistics (MOS) technique was used to improve the satellite-derived soil moisture data. MOS is a type of statistical post-processing, a class of techniques used to improve the forecasting ability of numerical weather models by relating model outputs to observational or additional model data.
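
The idea of relating model outputs to observations can be illustrated with the simplest possible case. The abstract does not state which statistical relation the authors used, so this sketch assumes a one-variable least-squares line fitted between satellite-derived values and ground observations; the function names `fit_linear_mos` and `apply_mos` and the sample values are hypothetical.

```python
def fit_linear_mos(model_values, observations):
    """Fit observation = intercept + slope * model_value by least squares.

    Assumption: a single-predictor linear relation stands in for the
    MOS correction; the actual study may use other predictors or forms.
    """
    n = len(model_values)
    if n != len(observations) or n < 2:
        raise ValueError("need at least two paired values")
    mean_x = sum(model_values) / n
    mean_y = sum(observations) / n
    sxx = sum((x - mean_x) ** 2 for x in model_values)
    if sxx == 0:
        raise ValueError("model values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(model_values, observations))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def apply_mos(value, intercept, slope):
    """Correct a new satellite-derived value with the fitted relation."""
    return intercept + slope * value


# Hypothetical paired values (volumetric soil moisture, m3/m3).
satellite = [0.10, 0.20, 0.30]
observed = [0.15, 0.25, 0.35]
intercept, slope = fit_linear_mos(satellite, observed)
corrected = apply_mos(0.40, intercept, slope)
```

The fit is done on a period where both satellite and ground data exist, and the resulting relation is then applied to satellite values where no observation is available.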

Application of Clustering Methods for Interpretation of Petroleum Spectra from Negative-Mode ESI FT-ICR MS

  • Yeo, In-Joon;Lee, Jae-Won;Kim, Sung-Hwan
    • Bulletin of the Korean Chemical Society / Vol. 31, No. 11 / pp.3151-3155 / 2010
  • This study was performed to develop analytical methods to better understand the properties and reactivity of petroleum, which is a highly complex organic mixture, using high-resolution mass spectrometry and statistical analysis. Ten crude oil samples were analyzed using negative-mode electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry (ESI FT-ICR MS). Clustering methods, including principal component analysis (PCA), hierarchical clustering analysis (HCA), and k-means clustering, were used to comparatively interpret the spectra. All the methods were consistent and showed that oxygen and sulfur-containing heteroatom species played important roles in clustering samples or peaks. The oxygen-containing samples had higher acidity than the other samples, and the clustering results were linked to properties of the crude oils. This study demonstrated that clustering methods provide a simple and effective way to interpret complex petroleomic data.

Genetic classification of various familial relationships using the stacking ensemble machine learning approaches

  • Su Jin Jeong;Hyo-Jung Lee;Soong Deok Lee;Ji Eun Park;Jae Won Lee
    • Communications for Statistical Applications and Methods / Vol. 31, No. 3 / pp.279-289 / 2024
  • Familial searching is a useful technique in a forensic investigation. Using genetic information, it is possible to identify individuals, determine familial relationships, and obtain racial/ethnic information. The total number of shared alleles (TNSA) and likelihood ratio (LR) methods have traditionally been used, and novel data-mining classification methods have recently been applied here as well. However, it is difficult to apply these methods to identify familial relationships above the third degree (e.g., uncle-nephew and first cousins). Therefore, we propose to apply a stacking ensemble machine learning algorithm to improve the accuracy of familial relationship identification. Using real data analysis, we obtain superior relationship identification results when applying meta-classifiers with a stacking algorithm rather than applying traditional TNSA or LR methods and data mining techniques.

Symbolic Cluster Analysis for Distribution Valued Dissimilarity

  • Matsui, Yusuke;Minami, Hiroyuki;Misuta, Masahiro
    • Communications for Statistical Applications and Methods / Vol. 21, No. 3 / pp.225-234 / 2014
  • We propose a novel hierarchical clustering method for distribution valued dissimilarities. Analysis of large and complex data has attracted significant interest. Symbolic Data Analysis (SDA), proposed by Diday in the 1980s, provides a new framework for statistical analysis. In SDA, we analyze an object with internal variation, such as an interval, a histogram, or a distribution, called a symbolic object. In this study, we focus on cluster analysis for distribution valued dissimilarities, one kind of symbolic object. Hierarchical clustering generally has two steps: a find out step and an update step. In the find out step, we find the nearest pair of clusters. We extend this step to distribution valued dissimilarities by introducing a measure on their order relations. In the update step, dissimilarities between clusters are redefined as a mixture of distributions with a mixing ratio. We present an actual example of the proposed method and a simulation study.
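
The two-step loop described in this abstract can be sketched for ordinary scalar dissimilarities. This is a simplified stand-in, not the proposed method: the paper works with distribution valued dissimilarities and an order measure, whereas the sketch below compares plain numbers and replaces the mixture of distributions with a size-weighted average, where the mixing ratio is assumed to be the relative cluster size. The function name `hierarchical_clustering` and the sample matrix are hypothetical.

```python
def hierarchical_clustering(matrix):
    """Agglomerative clustering on a symmetric scalar dissimilarity matrix.

    Returns the merge history as (cluster_a, cluster_b, dissimilarity).
    Merged clusters receive new ids n, n + 1, ...
    """
    n = len(matrix)
    clusters = {i: [i] for i in range(n)}
    dist = {(i, j): matrix[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        # Find out step: locate the nearest pair of clusters.
        (a, b), d = min(dist.items(), key=lambda item: item[1])
        # Update step: redefine dissimilarities to the merged cluster as a
        # weighted mix of the two old ones (assumed ratio: relative size).
        ratio = len(clusters[a]) / (len(clusters[a]) + len(clusters[b]))
        new_dist = {}
        for c in clusters:
            if c in (a, b):
                continue
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            new_dist[(c, next_id)] = ratio * d_ac + (1 - ratio) * d_bc
        # Drop every entry that involves the two merged clusters.
        dist = {pair: v for pair, v in dist.items()
                if a not in pair and b not in pair}
        dist.update(new_dist)
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges


# Hypothetical dissimilarities among four objects.
matrix = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
history = hierarchical_clustering(matrix)
```

With scalar inputs and a size-based ratio this reduces to group-average linkage; the distribution valued version would keep a whole distribution in each `dist` entry instead of a single number.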

A Development of Multivariate Analysis System by Using Excel

  • 한상태;강현철;한정훈
    • The Korean Journal of Applied Statistics / Vol. 17, No. 1 / pp.165-172 / 2004
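  • Recently, research on implementing multivariate data analysis as a software system has been pursued from various angles. A common feature of these studies is that they provide a system with a GUI (Graphical User Interface) environment so that general users can conveniently apply advanced statistical analysis techniques. As an extension of this line of research, this study develops a system using Excel, the office program most widely used across all fields of society, so that general users can easily perform multivariate data analysis interactively.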