• Title/Summary/Keyword: factor analysis(PCA: principal component analysis)

Principal Component Analysis on Marine Casualties Occurred at Korean Littoral Sea in Recent 5 Years (최근 5년간 국내 연근해에서 발생한 해양사고에 대한 주성분분석)

  • KIM, Yeong-Sik
    • Journal of Fisheries and Marine Sciences Education
    • /
    • v.28 no.2
    • /
    • pp.465-472
    • /
    • 2016
  • Principal Component Analysis (PCA) is a useful statistical technique for finding patterns in data and for expressing the data in a way that highlights their similarities and differences. In this paper, 1,417 marine casualties that occurred in the Korean littoral sea over the most recent five years were examined by PCA. The main results were as follows: 1. Most marine casualties resulted from human factors such as careless operation and insufficient engine maintenance. 2. Collision and stranding mainly resulted from steering room-related human factors such as careless watchkeeping and inadequate ship handling, whereas engine damage and fire/explosion mainly resulted from engine room-related human factors such as poor handling of the engine system. 3. The first principal component represents accident frequency, the second represents the cause, and the third represents the pattern of marine casualties.

Evaluation of the Geum River by Multivariate Analysis: Principal Component Analysis and Factor Analysis (다변량분석법을 이용한 금강 유역의 수질오염특성 연구)

  • Kim, Mi-Ah;Lee, Jae-kwan;Zoh, Kyung-Duk
    • Journal of Korean Society on Water Environment
    • /
    • v.23 no.1
    • /
    • pp.161-168
    • /
    • 2007
  • The main aim of this work is to evaluate the water quality of the Geum River using pollution data obtained by monitoring during the period 2001-2005. Data matrices of 19 (all monitoring stations)*13 (parameters), 60 (months)*13 (parameters), and 20 (seasons)*13 (parameters) were analyzed with the multivariate techniques of factor analysis/principal component analysis (FA/PCA). For the 19*13 matrix, FA/PCA identified two factors: a pollutant loading factor (BOD, COD, pH, Cond, T-N, T-P, $NH_3$-N, $NO_3$-N, $PO_4$-P, Chl-a) and a seasonal factor (water temp, SS). For the 60*13 and 20*13 matrices, it identified three factors: a pollutant loading factor (BOD, COD, Cond, T-N, T-P, $NH_3$-N, $NO_3$-N, $PO_4$-P), a seasonal factor (water temp, SS), and a metabolic factor (Chl-a, pH). The pollutant loading factor is the main influence in the Geum River, accounting for 71.16% (all sampling stations), 52.75% (by month), and 56.57% (by season) at the main water quality stations. In the station-wise analysis, the pollutant loading factor was influential at the Gongju 1, Gongju 2, Buyeo 1, Buyeo 2, Gangkyeong, and Yeongi stations. In the monthly analysis, it was influential in every month at the Gongju 1 and Cheongwon stations and in April, May, July, and August at Buyeo 1. In the seasonal analysis, it was influential in winter at Gongju 1 and Buyeo 1 among the main sampling stations, whereas Cheongwon showed no seasonal influence. This study demonstrates the necessity and usefulness of multivariate statistical techniques for evaluating and interpreting large, complex data sets in order to obtain better information for the effective management of water resources.

Comparison of hydrochemical informations of groundwater obtained from two different underground storage systems

  • Lee, Jeonghoon;Kim, Jun-Mo;Chang, Ho-Wan
    • Proceedings of the Korean Society of Soil and Groundwater Environment Conference
    • /
    • 2002.04a
    • /
    • pp.110-113
    • /
    • 2002
  • Statistically based principal component analysis (PCA) was applied to chemical data from two underground LPG storage systems to assess the usefulness of this technique at the initial stage (Pyeongtaek) or middle stage (Ulsan) of hydrochemical studies. In the first case, both natural and anthropogenic contamination characterize the regional groundwater: saline water buffered by Namyang Lake acts as a natural factor, whereas cement grouting acts as an artificial factor. In the second study area, contamination due to operation of the LPG caverns, such as disinfection activity and the effect of cement grouting, deteriorates groundwater quality. This study indicates that PCA would be particularly useful for summarizing large data sets for the purposes of subsurface characterization, assessing vulnerability to contamination, and protecting recharge zones.

Assessment of water quality variations under non-rainy and rainy conditions by principal component analysis techniques in Lake Doam watershed, Korea

  • Bhattrai, Bal Dev;Kwak, Sungjin;Heo, Woomyung
    • Journal of Ecology and Environment
    • /
    • v.38 no.2
    • /
    • pp.145-156
    • /
    • 2015
  • This study was based on water quality data of the Lake Doam watershed, monitored from 2010 to 2013 at eight different sites with multiple physicochemical parameters. The dataset was divided into two sub-datasets, namely, non-rainy and rainy. Principal component analysis (PCA) and factor analysis (FA) techniques were applied to evaluate seasonal correlations of water quality parameters and extract the most significant parameters influencing stream water quality. The first five principal components identified by PCA explained more than 80% of the total variance for both datasets. PCA and FA results indicated that total nitrogen, nitrate nitrogen, total phosphorus, and dissolved inorganic phosphorus were the most significant parameters under the non-rainy condition. This indicates that organic and inorganic pollutant loads in the streams can be related to discharges from point sources (domestic discharges) and non-point sources (agriculture, forest) of pollution. During the rainy period, turbidity, suspended solids, nitrate nitrogen, and dissolved inorganic phosphorus were identified as the most significant parameters. The physical parameters, suspended solids and turbidity, are related to soil erosion and runoff from the basin. Organic and inorganic pollutants during the rainy period can be linked to decayed matter, manure, and inorganic fertilizers used in farming. Thus, the results of this study suggest that principal component analysis techniques are useful for analysis and interpretation of data and identification of pollution factors, which are valuable for understanding seasonal variations in water quality for effective management.
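
The "first five components explain more than 80% of the variance" criterion in the abstract above can be illustrated with a short Python sketch. It is only an illustration: the data are synthetic stand-ins (not the Lake Doam measurements), the function names are hypothetical, and PCA on the correlation matrix is assumed because water quality parameters usually carry different units.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance explained by each principal component.

    PCA is done on the correlation matrix (an assumption), so each
    parameter contributes equally regardless of its unit.
    """
    eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))
    eigvals = eigvals[::-1]          # eigvalsh returns ascending order
    return eigvals / eigvals.sum()

def components_needed(ratios, threshold=0.80):
    """Smallest number of leading components whose cumulative share reaches threshold."""
    cumulative = np.cumsum(ratios)
    return int(np.searchsorted(cumulative, threshold) + 1)

# Synthetic samples-by-parameters table: 6 parameters driven by 2 hidden sources.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.3 * rng.normal(size=(200, 6))

ratios = explained_variance_ratio(X)
k = components_needed(ratios)
print("explained variance ratios:", np.round(ratios, 3))
print("components needed for 80%:", k)
```

The 80% threshold is a convention rather than a rule; the `threshold` argument makes it easy to see how the retained count changes with other cut-offs.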

Factor Analysis for Exploratory Research in the Distribution Science Field (유통과학분야에서 탐색적 연구를 위한 요인분석)

  • Yim, Myung-Seong
    • Journal of Distribution Science
    • /
    • v.13 no.9
    • /
    • pp.103-112
    • /
    • 2015
  • Purpose - This paper aims to provide a step-by-step approach to factor analytic procedures, such as principal component analysis (PCA) and exploratory factor analysis (EFA), and to offer a guideline for factor analysis. Some authors have argued that the results of PCA and EFA are substantially similar. They additionally assert that PCA is the more appropriate technique for factor analysis because it produces easily interpreted results that are likely to be the basis of better decisions. For these reasons, many researchers have used PCA instead of EFA. However, the two techniques are clearly different. PCA should be used for data reduction, whereas EFA is tailored to identify the underlying factor structure, that is, the set of latent variables that cause the manifest variables to covary. To date, however, the two techniques have been used indiscriminately, so a guideline and procedures for factor analysis are needed. Research design, data, and methodology - This research conducted a literature review, summarizing the meaningful and consistent arguments, drawing up guidelines, and suggesting procedures for rigorous EFA. Results - PCA can be used instead of common factor analysis when all measured variables have high communality. However, common factor analysis is recommended for EFA. First, researchers should evaluate the sample size and check for sampling adequacy before conducting factor analysis. If these conditions are not satisfied, the subsequent steps cannot be followed. The sample size must be at least 100 with communality above 0.5 and a subject-to-item ratio of at least 5:1, with a minimum of five items in EFA. Next, Bartlett's sphericity test and the Kaiser-Meyer-Olkin (KMO) measure should be assessed for sampling adequacy. The chi-square value for Bartlett's test should be significant, and a KMO of more than 0.8 is recommended. The next step is to conduct the factor analysis, which is composed of three stages. The first stage determines the extraction technique. Generally, maximum likelihood (ML) or principal axis factoring (PAF) gives the best results. The choice between the two hinges heavily on data normality: ML requires normally distributed data, whereas PAF does not. The second stage determines the number of factors to retain in the EFA. The best way to do so is to apply three methods together: eigenvalues greater than 1.0, the scree plot test, and the variance extracted. The last stage is to select one of two rotation methods, orthogonal or oblique. If the research suggests that some variables are correlated with each other, the oblique method should be selected because it assumes the factors are correlated. If not, the orthogonal method can be used. Conclusions - Recommendations are offered for best factor analytic practice in empirical research.
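
The pre-checks named in this abstract (Bartlett's sphericity test, the KMO measure, and the eigenvalue-greater-than-1.0 rule) can be sketched in Python with NumPy. This is a minimal sketch using the textbook formulas, not the paper's own procedure: the function names and the simulated questionnaire data are assumptions made for illustration. The p-value for Bartlett's statistic is left out to keep the example NumPy-only; it could be obtained from a chi-square survival function such as `scipy.stats.chi2.sf`.

```python
import numpy as np

def bartlett_sphericity(X):
    """Bartlett's statistic and degrees of freedom for H0: correlation matrix = identity."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    _, logdet = np.linalg.slogdet(R)          # log|R|, numerically stable
    stat = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return stat, df

def kmo(X):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    R = np.corrcoef(X, rowvar=False)
    P = np.linalg.inv(R)
    d = np.sqrt(np.diag(P))
    A = -P / np.outer(d, d)                   # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)     # off-diagonal mask
    r2 = np.sum(R[off] ** 2)
    a2 = np.sum(A[off] ** 2)
    return r2 / (r2 + a2)

def kaiser_count(X):
    """Number of correlation-matrix eigenvalues greater than 1.0."""
    eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))
    return int(np.sum(eigvals > 1.0))

# Simulated survey (hypothetical): 300 respondents, 6 items, 2 underlying factors.
rng = np.random.default_rng(42)
factors = rng.normal(size=(300, 2))
loadings = np.array([[0.8, 0.8, 0.8, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 0.8, 0.8, 0.8]])
X = factors @ loadings + 0.6 * rng.normal(size=(300, 6))

stat, df = bartlett_sphericity(X)
kmo_value = kmo(X)
n_factors = kaiser_count(X)
# With df = 15, the 5% chi-square critical value is about 25.0.
print(f"Bartlett chi-square = {stat:.1f} (df = {df})")
print(f"KMO = {kmo_value:.3f}, factors with eigenvalue > 1: {n_factors}")
```

The eigenvalue rule is only one of the three retention criteria the abstract recommends; the scree plot and the variance extracted would be checked alongside it.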

A Robust Principal Component Neural Network

  • Changha Hwang;Park, Hyejung;A, Eunyoung-N
    • Communications for Statistical Applications and Methods
    • /
    • v.8 no.3
    • /
    • pp.625-632
    • /
    • 2001
  • Principal component analysis (PCA) is a multivariate technique falling under the general title of factor analysis. The purpose of PCA is to identify the dependence structure behind a multivariate stochastic observation in order to obtain a compact description of it. In engineering, PCA is used mainly for data compression and restoration. In this paper we propose a new robust Hebbian algorithm for robust PCA. This algorithm is based on a hyperbolic tangent function due to Hampel et al. (1989), which is known to be robust in statistics. We conduct two experiments to investigate the performance of the new robust Hebbian learning algorithm for robust PCA.
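
For orientation, the standard, non-robust Hebbian approach to PCA (Oja's rule) can be sketched as follows. This is not the robust algorithm proposed in the paper, whose update rule the abstract does not specify; the function name, learning rate, and synthetic data are assumptions made for illustration.

```python
import numpy as np

def oja_first_component(X, lr=0.01, epochs=5, seed=0):
    """Estimate the first principal direction of centered data X with Oja's rule."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            y = w @ x                      # output of the linear neuron
            w += lr * y * (x - y * w)      # Hebbian growth with Oja's decay term
    return w / np.linalg.norm(w)

# Synthetic data whose dominant direction is the first axis.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3)) * np.array([2.0, 1.0, 0.5])
X -= X.mean(axis=0)

w_oja = oja_first_component(X)

# Reference: leading eigenvector of the sample covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
v_top = eigvecs[:, -1]
alignment = abs(w_oja @ v_top)             # 1.0 means identical up to sign
print("alignment with top eigenvector:", round(float(alignment), 3))
```

Because every sample enters the update through the unbounded output `y`, a single outlier can pull `w` strongly, which is the weakness that robust variants are designed to address.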

Novel assessment method of heavy metal pollution in surface water: A case study of Yangping River in Lingbao City, China

  • Liu, Yingran;Yu, Hongming;Sun, Yu;Chen, Juan
    • Environmental Engineering Research
    • /
    • v.22 no.1
    • /
    • pp.31-39
    • /
    • 2017
  • The primary purpose of this research is to understand the elements that define heavy metal contamination and to propose a novel assessment method based on principal component analysis (PCA) for the Yangping River region of Lingbao City, China. This paper makes detailed calculations of the single-factor assessment index ($P_i$) and Nemerow's multi-factor index ($P_N$) for heavy metals found in the surface water of the Yangping River. The maximum values of $P_i$ (Cd) and $P_i$ (Pb) were determined to be 892.000 and 113.800, respectively. The maximum value of $P_N$ was calculated to be 639.836. The results of Pearson's correlation analysis, hierarchical cluster analysis, and PCA indicated two heavy metal groupings: (Cu, Pb, Zn) and (As, Hg, Cd). The PCA-based pollution index ($P_{an}$) of the samples was subsequently calculated. The squared correlation coefficient between $P_{an}$ and $P_N$ was 0.996, which indicates that $P_{an}$ can serve as a new heavy metal pollution index: not only is this index able to eliminate the influence of the maximum value of $P_i$, but it also contains the principal component elements needed to evaluate heavy metal pollution levels.
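
The abstract does not give the formulas for $P_i$ and $P_N$. The sketch below assumes the commonly used definitions, $P_i = C_i / S_i$ (measured concentration over the standard limit) and $P_N = \sqrt{(P_{i,\max}^2 + \bar{P_i}^2)/2}$; the function names and concentrations are hypothetical and are not taken from the paper.

```python
import math

def single_factor_index(concentration, standard):
    """Single-factor index P_i: measured concentration divided by its standard limit."""
    return concentration / standard

def nemerow_index(p_values):
    """Nemerow multi-factor index from a list of single-factor indices."""
    p_max = max(p_values)
    p_mean = sum(p_values) / len(p_values)
    return math.sqrt((p_max ** 2 + p_mean ** 2) / 2)

# Hypothetical sample: (concentration, standard) pairs in the same unit per metal.
sample = {
    "Cu": (0.50, 1.00),
    "Zn": (0.80, 1.00),
    "Pb": (0.06, 0.05),
    "Cd": (0.10, 0.005),   # one strongly exceeded metal
}
p = {metal: single_factor_index(c, s) for metal, (c, s) in sample.items()}
p_n = nemerow_index(list(p.values()))
print({m: round(v, 2) for m, v in p.items()}, "P_N =", round(p_n, 2))
```

Under this definition the maximum term dominates whenever one metal far exceeds its limit, which is the sensitivity to the maximum $P_i$ that the abstract's PCA-based index is meant to address.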

A Classification Method Using Data Reduction

  • Uhm, Daiho;Jun, Sung-Hae;Lee, Seung-Joo
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.12 no.1
    • /
    • pp.1-5
    • /
    • 2012
  • Data reduction has been used widely in data mining to make analysis more convenient. Principal component analysis (PCA) and factor analysis (FA) are popular techniques that reduce the number of variables to avoid the curse of dimensionality, in which computing time increases exponentially with the number of variables. Many methods have therefore been published for dimension reduction. Data augmentation is another approach to analyzing data efficiently. The support vector machine (SVM) algorithm is a representative technique for dimension augmentation: it maps the original data to a high-dimensional feature space to obtain the optimal decision plane. Both data reduction and augmentation have been used to solve diverse problems in data analysis. In this paper, we compare the strengths and weaknesses of dimension reduction and augmentation for classification and propose a classification method using data reduction. We carry out comparative experiments to verify the performance of the proposed method.

Dimension reduction for high-dimensional data via mixtures of common factor analyzers-an application to tumor classification

  • Baek, Jang-Sun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.19 no.3
    • /
    • pp.751-759
    • /
    • 2008
  • Mixtures of factor analyzers (MFA) are useful for modeling the distribution of high-dimensional data in a much lower dimensional space where the number of observations is very large relative to their dimension. Mixtures of common factor analyzers (MCFA) can further reduce the number of parameters in the specification of the component covariance matrices when the number of classes is not small. Moreover, the factor scores of MCFA can be displayed in a low-dimensional space to distinguish the groups. We propose the factor scores of MCFA as new low-dimensional features for the classification of high-dimensional data. Compared with conventional dimension reduction methods such as principal component analysis (PCA) and canonical covariates (CV), the proposed factor scores were shown to have higher correct classification rates for three real data sets when used in parametric and nonparametric classifiers.

Assessment of Water Quality using Multivariate Statistical Techniques: A Case Study of the Nakdong River Basin, Korea

  • Park, Seongmook;Kazama, Futaba;Lee, Shunhwa
    • Environmental Engineering Research
    • /
    • v.19 no.3
    • /
    • pp.197-203
    • /
    • 2014
  • This study estimated the spatial and seasonal variation of water quality to understand the characteristics of the Nakdong River basin, Korea. Altogether, 11 parameters (discharge, water temperature, dissolved oxygen, 5-day biochemical oxygen demand, chemical oxygen demand, pH, suspended solids, electrical conductivity, total nitrogen, total phosphorus, and total organic carbon) at 22 different sites for the period 2003-2011 were analyzed using multivariate statistical techniques (cluster analysis, principal component analysis, and factor analysis). Hierarchical cluster analysis grouped the whole river basin into three zones, i.e., relatively less polluted (LP), medium polluted (MP), and highly polluted (HP), based on the similarity of water quality characteristics. The results of factor analysis/principal component analysis explained up to 83.0%, 81.7%, and 82.7% of the total variance in the water quality data of the LP, MP, and HP zones, respectively. The rotated components of PCA obtained from factor analysis indicate that the parameters responsible for water quality variations were mainly related to discharge and total pollution loads (non-point pollution source) in the LP, MP, and HP zones; organic and nutrient pollution in the LP and HP zones; and temperature, DO, and TN in the LP zone. This study demonstrates the usefulness of multivariate statistical techniques for the analysis and interpretation of multi-parameter, multi-location, and multi-year data sets.