• Title/Summary/Keyword: multivariate statistical techniques (다변량통계기법)

Multivariate pHd analysis (다변량 pHd 분석)

  • 이용구
    • The Korean Journal of Applied Statistics / v.8 no.1 / pp.61-74 / 1995
  • Many graphical methods have been developed in recent years, making it possible to obtain information directly from data. In particular, R-code (Cook and Weisberg, 1994) makes it possible to draw various two- and three-dimensional plots and to rotate their axes. However, the maximum dimension of a plot is three, so we cannot plot one response variable against more than three explanatory variables. Li (1991, 1992) developed methods that reduce the dimension of the explanatory variables, so that lower-dimensional plots can convey the information in the full set of explanatory variables. One of the dimension reduction methods developed by Li is pHd (principal Hessian directions). In this paper, we apply the pHd method to models with a multivariate response.
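
The dimension reduction step described in this abstract can be illustrated with a minimal sketch of the y-based pHd estimate for a single response: the directions are taken as eigenvectors of inv(Sigma_x) @ Sigma_yxx, ordered by absolute eigenvalue. This is not the paper's multivariate extension, which the abstract does not spell out. The function name `phd_directions` and the simulated data are assumptions made only for illustration.

```python
import numpy as np

def phd_directions(X, y):
    """y-based pHd sketch: eigen-decompose inv(Sigma_x) @ Sigma_yxx."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)          # centered predictors
    yc = y - y.mean()                # centered response
    sigma_x = Xc.T @ Xc / n          # predictor covariance
    # Response-weighted second-moment matrix: mean of yc_i * x_i x_i^T
    sigma_yxx = (Xc * yc[:, None]).T @ Xc / n
    evals, evecs = np.linalg.eig(np.linalg.solve(sigma_x, sigma_yxx))
    evals, evecs = evals.real, evecs.real
    order = np.argsort(-np.abs(evals))   # largest |eigenvalue| first
    return evals[order], evecs[:, order]

# Hypothetical data: the response depends on X only through x_1 ** 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 5))
y = X[:, 0] ** 2 + 0.1 * rng.normal(size=5000)
evals, dirs = phd_directions(X, y)
lead = dirs[:, 0] / np.linalg.norm(dirs[:, 0])   # leading estimated direction
```

In this simulated setting the leading direction should load almost entirely on the first predictor, so a plot of `y` against `X @ lead` would show the quadratic structure in two dimensions.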

Application of Multivariate Statistical Techniques to Analyze the Pollution Characteristics of Major Tributaries of the Nakdong River (낙동강 주요 지류의 오염특성 분석을 위한 다변량 통계기법의 적용)

  • Park, Jaebeom;Kal, Byungseok;Kim, Seongmin
    • Journal of Wetlands Research / v.21 no.3 / pp.215-223 / 2019
  • In this study, we analyzed the water quality characteristics of the major tributaries of the Nakdong River using correlation analysis, principal component and factor analysis, and cluster analysis. Organic matter and nutrients are highly correlated and are high in spring and autumn, so seasonal water quality management is required. Principal component and factor analysis showed that 82% of the total variance could be explained by four principal components: organic matter, nutrients, nature, and weather. BOD, COD, TOC, and TP were identified as the major influencing factors. The cluster analysis yielded four clusters, classified according to seasonal organic matter and nutrient pollution. The Kumho River watershed showed high pollution in all seasons. Effective water quality management in tributary streams therefore requires measures that consider spatio-temporal characteristics, and multivariate statistical techniques may be useful in water quality management and policy formulation.

Hydrogeochemical Characterization of Groundwater in Jeju Island using Principal Component Analysis and Geostatistics (주성분분석과 지구통계법을 이용한 제주도 지하수의 수리지화학 특성 연구)

  • Ko Kyung-Seok;Kim Yongie;Koh Dong-Chan;Lee Kwang-Sik;Lee Seung-Gu;Kang Cheol-Hee;Seong Hyun-Jeong;Park Won-Bae
    • Economic and Environmental Geology / v.38 no.4 s.173 / pp.435-450 / 2005
  • The purpose of this study is to analyze hydrogeochemical characteristics using multivariate statistical methods, to interpret the hydrogeochemical processes represented by the new variables obtained from principal component analysis (PCA), and to infer the groundwater flow and circulation mechanism by applying geostatistical methods to each element and principal component. Chloride and nitrate are the components that most influence groundwater quality, and the $NO_3$ content, which is increased by inputs from agricultural activities, shows the largest variation. The PCA results show that the first three principal components explain $73.9\%$ of the total variance. PC1 indicates the increase of dissolved ions, PC2 is related to the dissolution of carbonate minerals and nitrate contamination, and PC3 shows the effect of cation exchange and silicate mineral dissolution. Based on the experimental semivariograms, the groundwater components fall into two groups: one includes electrical conductivity (EC), Cl, Na, and $NO_3$, and the other includes $HCO_3$, $SiO_2$, Ca, and Sr. The spatial distributions of the groundwater components show that EC, Cl, and Na increase toward the coastline and that nitrate is closely related to the presence of agricultural land. These components are also correlated with topographic features that reflect the groundwater recharge effect. Kriging of the principal components shows that PC1 has a spatial distribution different from those of Cl, Na, and EC, possibly because pH, Ca, Sr, and $HCO_3$ also influence PC1. The linear anomaly zone of PC2 in the western area is considered to be caused by the dissolution of carbonate minerals. Consequently, applying multivariate and geostatistical methods to groundwater in the study area is very useful for the quantitative analysis of water quality data and for characterizing its spatial distribution.
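
A figure such as "the first three principal components explain 73.9% of the total variance" is a cumulative explained-variance ratio. The sketch below shows one common way to compute it, from the eigenvalues of the correlation matrix of the standardized variables. The function name and the simulated data are assumptions, and the numbers it produces have no connection to the groundwater data in the abstract.

```python
import numpy as np

def cumulative_explained_variance(X):
    """Cumulative share of total variance explained by successive PCs."""
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)        # PCA on standardized variables
    eigvals = np.linalg.eigvalsh(corr)[::-1]   # eigenvalues, largest first
    return np.cumsum(eigvals) / eigvals.sum()

# Hypothetical data: four variables driven by one shared latent factor.
rng = np.random.default_rng(1)
latent = rng.normal(size=(500, 1))
X = latent @ np.ones((1, 4)) + 0.1 * rng.normal(size=(500, 4))
cum = cumulative_explained_variance(X)   # cum[k - 1] is the share for the first k PCs
```

Whether to use the correlation matrix or the covariance matrix depends on whether the variables share a common unit; the abstract does not say which was used.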

The Technology for On-line Measurement of Coal Properties by using Near-Infrared (근적외선을 이용한 온라인 석탄 성상분석 방법)

  • Kim, Dong-Won;Lee, Jong-Min;Kim, Jae-Sung;Kim, Hak-Jong
    • Korean Chemical Engineering Research / v.45 no.6 / pp.596-603 / 2007
  • Rapid or on-line coal analysis is of great interest to the coal industry because it would allow efficient plant operation. Multivariate analysis was applied to near-infrared (NIR) spectra of coal to investigate the relationship between coal properties (%) (moisture, ash, volatile matter, fixed carbon, carbon, hydrogen, nitrogen, oxygen, sulfur), heating value (kcal/kg), and the corresponding NIR spectral data. The quantitative analysis used PLS (partial least squares) regression to relate coal properties to the NIR spectral data, after mathematical pre-treatments were applied to minimize the influence of the samples' physical features. The results show that this technique can classify coal species and predict all coal properties except ash, nitrogen, and sulfur. Real-time on-line analysis of coal moisture and heating value is expected to enable efficient operation of coal-fired power plants.
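
As a rough illustration of the PLS regression step, the following is a minimal single-response PLS (PLS1) sketch using NIPALS-style deflation. It is not the authors' calibration procedure: the function name `pls1_fit`, the four simulated "spectral channels", and the absence of any spectral pre-treatment are all assumptions. Real NIR data would have far more channels than this toy example.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """PLS1 sketch: returns regression coefficients and intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean            # centered working copies
    p = X.shape[1]
    W = np.zeros((p, n_components))            # weight vectors
    P = np.zeros((p, n_components))            # X loadings
    q = np.zeros(n_components)                 # y loadings
    for a in range(n_components):
        w = Xk.T @ yk                          # direction of max covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                             # score vector
        tt = t @ t
        P[:, a] = Xk.T @ t / tt
        q[a] = yk @ t / tt
        W[:, a] = w
        Xk = Xk - np.outer(t, P[:, a])         # deflate X
        yk = yk - q[a] * t                     # deflate y
    coef = W @ np.linalg.solve(P.T @ W, q)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Hypothetical data: 4 channels on different scales, one property to predict.
rng = np.random.default_rng(4)
X = rng.normal(size=(60, 4)) * np.array([1.0, 2.0, 4.0, 8.0])
y = X @ np.array([0.5, -0.3, 0.2, 0.1]) + 0.05 * rng.normal(size=60)
coef, intercept = pls1_fit(X, y, n_components=4)
# Ordinary least squares on centered data, for comparison.
ols = np.linalg.lstsq(X - X.mean(axis=0), y - y.mean(), rcond=None)[0]
```

With as many components as predictors, PLS1 reproduces the ordinary least squares fit, which makes a convenient sanity check. In practice fewer components are kept, typically chosen by cross-validation.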

Parameter Regionalization of Semi-Distributed Runoff Model Using Multivariate Statistical Analysis (다변량 통계분석을 이용한 준분포형 유출모형 매개변수 지역화)

  • Lee, Byong-Ju;Jung, Il-Won;Bae, Deg-Hyo
    • Journal of Korea Water Resources Association / v.42 no.2 / pp.149-160 / 2009
  • The objective of this study is to propose a parameter regionalization scheme that integrates two multivariate statistical methods: principal component analysis (PCA) and hierarchical cluster analysis (HCA). The scheme is intended for applying a semi-distributed rainfall-runoff model to ungauged catchments. Seven catchment characteristics (area, mean altitude, mean slope, ratio of forest, water content at saturation, field capacity, and wilting point) are estimated for 109 mid-sized sub-basins. The first two components from the PCA account for 82.11% of the total variance in the dataset. Component 1 is related to the location of the catchments in terms of altitude, and Component 2 is related to their area. Using HCA, 103 ungauged catchments are clustered into the following six groups: Goesan 23, Andong 6, Imha 5, Hapcheon 21, Yongdam 4, Seomjin 44. The SWAT model is used to simulate runoff, and its parameters are estimated on the six gauged basins. The model parameters were regionalized for the Soyang, Chungju, and Daecheong dam basins, which were treated as ungauged. The model efficiency coefficients of the simulated inflows for these three dams were at least 0.8, indicating a good fit to the observed inflows. This research will contribute to estimating and analyzing hydrologic components in ungauged catchments.

Penalized least distance estimator in the multivariate regression model (다변량 선형회귀모형의 벌점화 최소거리추정에 관한 연구)

  • Jungmin Shin;Jongkyeong Kang;Sungwan Bang
    • The Korean Journal of Applied Statistics / v.37 no.1 / pp.1-12 / 2024
  • In many real-world data sets, multiple response variables depend on the same set of explanatory variables. In particular, if several response variables are correlated with each other, simultaneous estimation that accounts for this correlation may be more effective than analyzing each response variable separately. In such multivariate regression analysis, the least distance estimator (LDE) estimates the regression coefficients simultaneously by minimizing the distance between each training observation and its estimate in a multidimensional Euclidean space. It is also robust to outliers. In this paper, we examine the least distance estimation method in multivariate linear regression analysis and present the penalized least distance estimator (PLDE) for efficient variable selection. We propose the LDE with an adaptive group LASSO penalty term (AGLDE), which can reflect the correlation between response variables in the model and can efficiently select variables according to the importance of the explanatory variables. The validity of the proposed method was confirmed through simulations and real data analysis.
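
The unpenalized LDE objective, the sum over observations of the Euclidean norm of the residual vector, can be sketched as follows. The solver here is iteratively reweighted least squares, which is an assumption chosen for brevity and not necessarily the algorithm used in the paper. The adaptive group LASSO penalty is not implemented. The function name `lde_fit` and the simulated data are hypothetical.

```python
import numpy as np

def lde_fit(X, Y, n_iter=100, eps=1e-8):
    """Minimize sum_i ||y_i - B^T x_i||_2 by IRLS (illustrative solver)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    B = np.linalg.lstsq(X, Y, rcond=None)[0]         # start from least squares
    for _ in range(n_iter):
        resid_norm = np.linalg.norm(Y - X @ B, axis=1)
        w = 1.0 / np.maximum(resid_norm, eps)        # large residual, small weight
        Xw = X * w[:, None]
        B = np.linalg.solve(Xw.T @ X, Xw.T @ Y)      # weighted least squares step
    return B

# Hypothetical data: 3 predictors, 2 responses, 10% of rows grossly shifted.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
B_true = np.array([[1.0, -1.0], [2.0, 0.5], [0.0, 3.0]])
Y = X @ B_true + 0.1 * rng.normal(size=(200, 2))
Y[:20] += 50.0                                       # outlying responses
B_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
B_lde = lde_fit(X, Y)
err_ols = np.linalg.norm(B_ols - B_true)
err_lde = np.linalg.norm(B_lde - B_true)
```

Because each observation enters the objective through an unsquared norm, a grossly shifted row has bounded influence on the fit, which is the robustness property the abstract refers to. In this simulation `err_lde` should be much smaller than `err_ols`.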

A Study on Constuct of Value-Added Productivity Structure Model using Multivariate Statistical Method (다변량통계기법을 이용한 부가가치생산성 구조모델의 구상에 관한 연구)

  • 이영찬;조성훈;김태성
    • Journal of Korean Society of Industrial and Systems Engineering / v.19 no.38 / pp.117-129 / 1996
  • This study uses statistical analysis to examine how three factors, namely indices of capital, labor, and distribution, actually affect value-added productivity. To this end, we selected 12 value-added indices from the 'Annual report of Korean companies' published by Korea Investors Service, Inc., for a total of 85 companies in the chemicals and chemical products sector. Using these data, multivariate statistical analyses such as principal component analysis, factor analysis, and covariance structure analysis are used to model the effect of the three factors (labor productivity, capital productivity, and the index of distribution) on value-added productivity.

Extract the main factors related to ground subsidence near abandoned underground coal mine using PCA (PCA 기법을 이용한 폐탄광 지역의 지반침하 관련 요인 추출)

  • Choi, Jong-Kuk;Kim, Ki-Dong
    • Proceedings of the KSRS Conference / 2007.03a / pp.301-304 / 2007
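  • In this study, principal component analysis (PCA), a multivariate statistical method, and a geographic information system (GIS) were used to extract the main factors affecting ground subsidence in abandoned coal mine areas. To this end, data acquired from surface geological surveys, detailed surveys, and laboratory rock tests conducted in the study area were compiled into a database. The following factors, which allow the distribution of subsidence hazard areas to be interpreted spatially, were selected for analysis: geology, land use, slope, depth from the surface to the underground drift, horizontal distance from the surface position of the drift, groundwater depth, permeability coefficient, and RMR (Rock Mass Rating) value. So that each factor covered the entire study area, a raster database was built using the GIS spatial analysis techniques of surface analysis, buffering, and interpolation, and PCA was performed with the data extracted from it as input. The PCA made it possible to extract the main factors affecting ground subsidence in abandoned coal mine areas, and factors related to geology and ground strength were found to be the largest contributors to subsidence in the study area.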

Multivariate Stratification Method for the Multipurpose Sample Survey : A Case Study of the Sample Design for Fisher Production Survey (다목적 표본조사를 위한 다변량 층화 : 어업비계통생산량조사를 위한 표본설계 사례)

  • Park, Jin-Woo;Kim, Young-Won;Lee, Seok-Hoon;Shin, Ji-Eun
    • Survey Research / v.9 no.1 / pp.69-85 / 2008
  • Stratification is a feature of most field sample designs. This paper considers a multivariate stratification strategy for multipurpose sample surveys with several auxiliary variables. In a multipurpose survey, the stratification procedure is very complicated because the efficiency of stratification must be considered simultaneously for several variables of interest. We propose a stratification strategy based on factor analysis and cluster analysis using several stratification variables. To improve the efficiency of stratification, we first select the stratification variables by factor analysis and then apply the K-means clustering algorithm to form the strata. An application of this strategy to the sampling design for the Fisher Production Survey is discussed, and the variances of the resulting estimators turn out to be significantly smaller than those obtained by simple random sampling.
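
The second stage of the strategy, forming strata by K-means and comparing the variance of the estimated mean with that under simple random sampling, can be sketched as below. The factor analysis stage is omitted, the farthest-first initialization is an illustrative choice, proportional allocation is assumed, and all names and data are hypothetical. None of this is taken from the Fisher Production Survey design.

```python
import numpy as np

def kmeans_strata(Z, k, n_iter=50):
    """Lloyd's K-means with deterministic farthest-first initialization."""
    Z = np.asarray(Z, dtype=float)
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])          # farthest point from chosen centers
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)             # assign each unit to nearest center
        for h in range(k):
            if np.any(labels == h):              # keep old center if a stratum empties
                centers[h] = Z[labels == h].mean(axis=0)
    return labels

def mean_variance_srs(y, n):
    """Variance of the sample mean under simple random sampling without replacement."""
    N = len(y)
    return (1 - n / N) * np.var(y, ddof=1) / n

def mean_variance_stratified(y, labels, n):
    """Variance of the stratified mean under proportional allocation."""
    N = len(y)
    within = 0.0
    for h in np.unique(labels):
        y_h = y[labels == h]
        s2_h = np.var(y_h, ddof=1) if len(y_h) > 1 else 0.0
        within += len(y_h) / N * s2_h            # W_h * S_h^2
    return (1 - n / N) * within / n

# Hypothetical population: two groups that differ in both auxiliaries and y.
rng = np.random.default_rng(3)
group = np.repeat([0, 1], 300)
Z = 10.0 * group[:, None] + rng.normal(size=(600, 2))    # stratification variables
y = 20.0 * group + rng.normal(size=600)                   # variable of interest
labels = kmeans_strata(Z, k=2)
var_srs = mean_variance_srs(y, n=60)
var_st = mean_variance_stratified(y, labels, n=60)
```

The gain comes entirely from how well the strata separate the variable of interest; in a multipurpose survey the same strata must work for several such variables at once, which is why the abstract selects the stratification variables first.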

A Study on the Network Generation Methods for Visualizing Knowledge Domain's Intellectual Structure (지적 구조의 시각화를 위한 네트워크 형성 방식에 관한 연구)

  • Lee, Jae-Yun
    • Proceedings of the Korean Society for Information Management Conference / 2005.08a / pp.349-356 / 2005
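  • We examined, with examples, the characteristics of various network generation methods for visually representing bibliometric data in intellectual structure analysis. Visual representation of bibliometric data such as co-citation has long been recognized as a very effective approach to analyzing intellectual structure. Multivariate statistical techniques such as multidimensional scaling, cluster analysis, and correspondence analysis have traditionally been used for visualization. Recently, new network generation techniques such as pathfinder networks have been proposed as alternatives to the traditional techniques, but analyzing intellectual structure in network form in fact dates back to the early days of citation analysis research. We expect that examining the characteristics of the various network generation methods will help stimulate bibliometric analysis.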
