• Title/Summary/Keyword: Principal component analysis(PCA)

Assessment of Water Quality using Multivariate Statistical Techniques: A Case Study of the Nakdong River Basin, Korea

  • Park, Seongmook;Kazama, Futaba;Lee, Shunhwa
    • Environmental Engineering Research / v.19 no.3 / pp.197-203 / 2014
  • This study estimated the spatial and seasonal variation of water quality to understand the characteristics of the Nakdong River basin, Korea. Altogether, 11 parameters (discharge, water temperature, dissolved oxygen, 5-day biochemical oxygen demand, chemical oxygen demand, pH, suspended solids, electrical conductivity, total nitrogen, total phosphorus, and total organic carbon) measured at 22 sites over the period 2003-2011 were analyzed using multivariate statistical techniques (cluster analysis, principal component analysis, and factor analysis). Hierarchical cluster analysis grouped the whole river basin into three zones, i.e., relatively less polluted (LP), medium polluted (MP), and highly polluted (HP), based on the similarity of water quality characteristics. The results of factor analysis/principal component analysis explained up to 83.0%, 81.7%, and 82.7% of the total variance in the water quality data of the LP, MP, and HP zones, respectively. The rotated components of PCA obtained from factor analysis indicate that the parameters responsible for water quality variations were mainly related to discharge and total pollution loads (non-point pollution source) in the LP, MP, and HP zones; organic and nutrient pollution in the LP and HP zones; and temperature, dissolved oxygen (DO), and total nitrogen (TN) in the LP zone. This study demonstrates the usefulness of multivariate statistical techniques for the analysis and interpretation of multi-parameter, multi-location, and multi-year data sets.
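
The "percentage of total variance explained" quoted in abstracts like this one can be illustrated with a short sketch. It assumes NumPy, standardized variables (PCA on the correlation matrix), and synthetic data standing in for a sites-by-parameters table. The function name `explained_variance_ratio` and the data are hypothetical and are not taken from the study.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component.

    X is an (n_samples, n_features) array. PCA is run on the correlation
    matrix, which is equivalent to standardizing each column first; this
    matters when parameters are measured in different units.
    """
    X = np.asarray(X, dtype=float)
    # Eigenvalues of the correlation matrix are the component variances.
    eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))
    # eigvalsh returns ascending order; reverse and clip round-off negatives.
    eigvals = np.clip(eigvals[::-1], 0.0, None)
    return eigvals / eigvals.sum()

# Synthetic stand-in data (NOT the study's measurements): 6 variables
# driven by 2 latent factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(200, 6))

ratio = explained_variance_ratio(X)
cumulative = np.cumsum(ratio)  # cumulative[k-1] = share explained by the first k PCs
```

Reading off `cumulative` at the number of retained components gives a figure of the same kind as the 83.0%, 81.7%, and 82.7% reported above.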

Multi-temporal Remote Sensing Data Analysis using Principal Component Analysis (주성분분석을 이용한 다중시기 원격탐사 자료분석)

  • Jeong, Jong-Chul
    • Journal of the Korean Association of Geographic Information Studies / v.2 no.3 / pp.71-80 / 1999
  • The aim of the present study is to define and tentatively interpret the distribution of polluted water released from Lake Sihwa into the Yellow Sea using Landsat TM. Since the region is an extreme Case 2 water, empirical algorithms for detecting concentrations of chlorophyll-a and suspended sediments have limitations. This work focuses on the use of multi-temporal Landsat TM data. We applied PCA to detect the evolution of the spatial features of the polluted water after its release from Lake Sihwa. The PCA results were compared with in situ data, such as chlorophyll-a, suspended sediments, Secchi disk depth (SDD), surface temperature, and remote sensing reflectance at six SeaWiFS channels. In addition, the in situ remote sensing reflectance obtained by the PRR-600 (Profiling Reflectance Radiometer) was compared with the PCA results of the Landsat TM data sets. A good correlation was found between the first principal component and Secchi disk depth ($R^2$=0.7631), although the other variables did not show such a good correlation. Problems in applying PCA techniques to multi-spectral remotely sensed data are also discussed in this paper.

Hotelling T2 Index Based PCA Method for Fault Detection in Transient State Processes (과도상태에서의 고장검출을 위한 Hotelling T2 Index 기반의 PCA 기법)

  • Asghar, Furqan;Talha, Muhammad;Kim, Se-Yoon;Kim, SungHo
    • Journal of Institute of Control, Robotics and Systems / v.22 no.4 / pp.276-280 / 2016
  • Due to the increasing interest in safety and consistent product quality over the past few decades, demand for effective quality monitoring and safe operation in modern industry has propelled research into statistics-based fault detection and diagnosis methods. This paper describes the application of a Hotelling $T^2$ index based Principal Component Analysis (PCA) method for fault detection and diagnosis in industrial processes. Multivariate statistical process control techniques are now widely used for performance monitoring and fault detection. Conventional methods such as PCA are suitable only for steady-state processes. These conventional projection methods cause false alarms or missing data for systems with transient process values. These issues significantly compromise the reliability of monitoring systems. In this paper, a reliable method is used to overcome the false alarms that occur due to varying process conditions and the missing data problems in transient states. This monitoring method is implemented and validated experimentally using MATLAB. Experimental results demonstrated the credibility of this fault detection method for both steady-state and transient operations.
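
For readers unfamiliar with the Hotelling $T^2$ index, the following is a minimal sketch of the textbook steady-state version, written in Python with NumPy rather than the MATLAB used in the paper. It does not reproduce the paper's handling of transient states. The function names, the synthetic data, and the empirical 99th-percentile control limit are all assumptions made for illustration; F-distribution limits are more usual in practice.

```python
import numpy as np

def fit_pca_monitor(X_train, n_components):
    """Fit a PCA monitoring model on normal-operation data."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    # Rows of Vt are the principal directions of the standardized data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T                        # (n_features, k)
    score_var = s[:n_components] ** 2 / (len(X_train) - 1)
    return mean, std, loadings, score_var

def hotelling_t2(X, model):
    """T^2 = sum over retained components of score^2 / score variance."""
    mean, std, loadings, score_var = model
    scores = ((np.atleast_2d(X) - mean) / std) @ loadings
    return np.sum(scores ** 2 / score_var, axis=1)

# Synthetic "normal operation" data (hypothetical, 5 correlated sensors).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
X_train = latent @ rng.normal(size=(2, 5)) + 0.2 * rng.normal(size=(500, 5))

model = fit_pca_monitor(X_train, n_components=2)
t2_train = hotelling_t2(X_train, model)
# Assumption: an empirical control limit taken from the training data.
limit = np.percentile(t2_train, 99)

# A hypothetical fault: 10 standard deviations along the first component.
mean, std, loadings, score_var = model
x_fault = mean + std * (10.0 * np.sqrt(score_var[0]) * loadings[:, 0])
t2_fault = hotelling_t2(x_fault, model)[0]
is_fault = t2_fault > limit
```

A new sample is flagged when its $T^2$ exceeds the control limit. In a transient process the mean and covariance drift, which is why a fixed model of this kind tends to raise false alarms.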

Attack Detection and Classification Method Using PCA and LightGBM in MQTT-based IoT Environment (MQTT 기반 IoT 환경에서의 PCA와 LightGBM을 이용한 공격 탐지 및 분류 방안)

  • Lee Ji Gu;Lee Soo Jin;Kim Young Won
    • Convergence Security Journal / v.22 no.4 / pp.17-24 / 2022
  • Recently, research on machine learning-based cyber attack detection and classification has been actively conducted, achieving a high level of detection accuracy. However, low-spec IoT devices and large-scale network traffic make it difficult to apply machine learning-based detection models in IoT environments. Therefore, in this paper, we propose an efficient IoT attack detection and classification method using PCA (Principal Component Analysis) and LightGBM (Light Gradient Boosting Machine), with datasets collected in an MQTT (Message Queuing Telemetry Transport) IoT protocol environment that is also used in the defense field. In the experiment, even though the original dataset was reduced to about 15%, the performance was almost the same as that of the original. The method also showed the best performance in a comparative evaluation against the four dimensionality reduction techniques selected in this paper.

Dimensionality Reduction of RNA-Seq Data

  • Al-Turaiki, Isra
    • International Journal of Computer Science & Network Security / v.21 no.3 / pp.31-36 / 2021
  • RNA sequencing (RNA-Seq) is a technology that facilitates transcriptome analysis using next-generation sequencing (NGS) tools. Information on the quantity and sequences of RNA is vital to relate our genomes to functional protein expression. RNA-Seq data are characterized as being high-dimensional in that the number of variables (i.e., transcripts) far exceeds the number of observations (e.g., experiments). Given the wide range of dimensionality reduction techniques, it is not clear which is best for RNA-Seq data analysis. In this paper, we study the effect of three dimensionality reduction techniques on the classification of an RNA-Seq dataset. In particular, we use PCA, SVD, and SOM to obtain a reduced feature space. We built nine classification models for a cancer dataset and compared their performance. Our experimental results indicate that better classification performance is obtained with PCA and SOM. Overall, the combinations PCA+KNN, SOM+RF, and SOM+KNN produce the preferred results.
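
As a rough illustration of a PCA+KNN pipeline for data with far more variables than observations, here is a NumPy-only sketch. It computes PCA through the SVD of the centered data, which avoids forming a features-by-features covariance matrix. The function names, the synthetic two-class data, and the choices of 2 components and k=3 are assumptions for illustration and are not the paper's settings.

```python
import numpy as np

def pca_fit(X, n_components):
    """PCA via SVD of the centered data; practical when n_features >> n_samples."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

def knn_predict(train_scores, train_labels, test_scores, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dists = np.linalg.norm(
        test_scores[:, None, :] - train_scores[None, :, :], axis=2
    )
    nearest = np.argsort(dists, axis=1)[:, :k]
    return np.array([np.bincount(row).argmax() for row in train_labels[nearest]])

# Synthetic stand-in for an expression matrix (NOT real RNA-Seq data):
# 1000 features, two classes that differ in the first 100 features.
rng = np.random.default_rng(0)
n_features = 1000
shift = np.zeros(n_features)
shift[:100] = 5.0

def make_data(n_per_class):
    X = rng.normal(size=(2 * n_per_class, n_features))
    y = np.repeat([0, 1], n_per_class)
    X[y == 1] += shift
    return X, y

X_train, y_train = make_data(20)
X_test, y_test = make_data(10)

mean, components = pca_fit(X_train, n_components=2)
train_scores = pca_transform(X_train, mean, components)
test_scores = pca_transform(X_test, mean, components)  # reuse the training mean and axes
accuracy = np.mean(knn_predict(train_scores, y_train, test_scores) == y_test)
```

The test samples are projected with the mean and components fitted on the training set only, so no information from the test set leaks into the reduced feature space.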

Automatic Machine Fault Diagnosis System using Discrete Wavelet Transform and Machine Learning

  • Lee, Kyeong-Min;Vununu, Caleb;Moon, Kwang-Seok;Lee, Suk-Hwan;Kwon, Ki-Ryong
    • Journal of Korea Multimedia Society / v.20 no.8 / pp.1299-1311 / 2017
  • Sound-based machine fault diagnosis covers all studies that aim to automatically detect faults or damage on machines using the sounds they emit. Conventional methods that use mathematical models have been found inaccurate because of the complexity of industrial machinery systems and the presence of nonlinear factors such as noise. Therefore, any fault diagnosis issue can be treated as a pattern recognition problem. We present here an automatic fault diagnosis system for hand drills using the discrete wavelet transform (DWT) and pattern recognition techniques such as principal component analysis (PCA) and artificial neural networks (ANN). The diagnosis system consists of three steps. Because of the presence of many noisy patterns in our signals, we first conduct a filtering analysis based on DWT. Second, the wavelet coefficients of the filtered signals are extracted as our features for the pattern recognition part. Third, PCA is performed over the wavelet coefficients in order to reduce the dimensionality of the feature vectors. Finally, the first few principal components are used as the inputs of an ANN-based classifier to detect wear on the drills. The results show that the proposed DWT-PCA-ANN method can be used for sound-based automated diagnosis systems.
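
The feature-extraction part of such a pipeline (wavelet coefficients followed by PCA) can be sketched as below. The sketch assumes a Haar wavelet, three decomposition levels, and random stand-in signals, none of which are specified in the abstract. It omits both the filtering step and the ANN classifier, and the function names are hypothetical.

```python
import numpy as np

def haar_dwt(signal, levels):
    """Multi-level Haar DWT; returns [approx, detail_L, ..., detail_1].

    The signal length must be divisible by 2**levels.
    """
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # high-pass (detail) part
        approx = (even + odd) / np.sqrt(2.0)         # low-pass (approximation) part
    return [approx] + details[::-1]

def pca_reduce(features, n_components):
    """Project feature vectors onto their first principal components."""
    centered = features - features.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:n_components].T

# Hypothetical stand-in recordings: 30 signals of 256 samples each.
rng = np.random.default_rng(1)
signals = rng.normal(size=(30, 256))

# One feature vector per signal: all wavelet coefficients concatenated.
features = np.array([np.concatenate(haar_dwt(s, levels=3)) for s in signals])
reduced = pca_reduce(features, n_components=5)  # would feed a classifier
```

Because the Haar transform is orthonormal, the coefficient vector has the same energy as the original signal, so the PCA step here only re-expresses and truncates the information rather than rescaling it.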

Estimation of Weights in Water Management Resilience Index Using Principal Component Analysis(PCA) (주성분 분석(PCA)을 이용한 물관리 탄력성 지수의 가중치 산정)

  • Park, Jung Eun;Lim, Kwang Suop;Lee, Eul Rae
    • Proceedings of the Korea Water Resources Association Conference / 2016.05a / pp.583-583 / 2016
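  • Composite indices that incorporate a variety of evaluation indicators are used as tools for prioritizing water management policies and monitoring policy performance. The weights, which express the importance of each indicator, can affect the final index value, and there are many ways to determine them, including Data Envelopment Analysis (DEA), the Benefit of the Doubt approach (BOD), the Unobserved Components Model (UCM), the Budget Allocation Process (BAP), the Analytic Hierarchy Process (AHP), and Conjoint Analysis (CA). In this study, principal component analysis (PCA), a statistical method among the various weighting approaches, was used to estimate weights for the Water Management Resilience Index (WMRI) proposed by Park et al. (2016), and the results were compared with the existing results obtained with equal weights. The WMRI comprises three sub-indices: vulnerability of water management under natural conditions (Vulnerability), robustness of existing water resources infrastructure (Robustness), and diversity of water-crisis adaptation strategies (Redundancy), consisting of 13, 11, and 7 indicators, respectively. It was applied to 117 mid-sized watersheds classified into main-stream basins downstream of multipurpose dams (category 1), tributaries where water supply and flow regulation are not possible (category 2), and tributaries where they are possible (category 3). The 3, 5, and 3 principal components extracted for the respective sub-indices were found to explain 76.4%, 71.2%, and 63.2% of the total data, and the eigenvectors and eigenvalues of the principal components of each sub-index were calculated to estimate the weight of each indicator. Compared with equal weights, the PCA-based weights produced differences of 1.9% for the vulnerability sub-index, 1.9% for the robustness sub-index, and 2.1% for the redundancy sub-index, and a difference of 0.4% for the WMRI, which confirmed the adequacy of the results presented by Park et al. As one statistical approach to setting objective weights, PCA could be used when computing various water management policy indices, and applying other weighting methods in future work is expected to allow analysis of the sensitivity of index results to each method and of the strengths and weaknesses of the methods.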
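
The abstract does not state the exact formula used to turn eigenvectors and eigenvalues into indicator weights. One common scheme, shown here purely as an assumption, weights each indicator by its squared eigenvector entries, averaged over the retained components in proportion to their eigenvalues. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def pca_indicator_weights(X, n_components):
    """Indicator weights from the leading eigenpairs of the correlation matrix.

    Assumed scheme (not confirmed by the source):
        w_j = sum_m (lambda_m / sum of retained lambdas) * v_jm ** 2
    Each eigenvector has unit length, so the weights are non-negative
    and sum to 1 without further normalization.
    """
    R = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)               # ascending order
    order = np.argsort(eigvals)[::-1][:n_components]   # indices of the largest
    lam = eigvals[order]
    V = eigvecs[:, order]                              # one eigenvector per column
    share = lam / lam.sum()
    return (V ** 2) @ share

# Synthetic stand-in for one sub-index: 117 watersheds x 7 indicators
# (NOT the study's data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(117, 3))
X = latent @ rng.normal(size=(3, 7)) + 0.5 * rng.normal(size=(117, 7))

weights = pca_indicator_weights(X, n_components=3)

# Compare a PCA-weighted sub-index with an equally weighted one.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
index_pca = Z @ weights
index_equal = Z.mean(axis=1)
```

Comparing `index_pca` with `index_equal` mirrors the kind of check described in the abstract, where PCA-based and equal weights led to differences of about 2% or less.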

A Study on Fault Detection Monitoring and Diagnosis System of CNG Stations based on Principal Component Analysis(PCA) (주성분분석(PCA) 기법에 기반한 CNG 충전소의 이상감지 모니터링 및 진단 시스템 연구)

  • Lee, Kijun;Lee, Bong Woo;Choi, Dong-Hwang;Kim, Tae-Ok;Shin, Dongil
    • Journal of the Korean Institute of Gas / v.18 no.3 / pp.53-59 / 2014
  • In this study, we propose a system that builds a monitoring model for compressed natural gas (CNG) stations, which operate only in non-stationary modes, and performs real-time monitoring and abnormality diagnosis using principal component analysis (PCA), a multivariate statistical analysis method suited to processing large amounts of multi-dimensional data. We build the model by calculating new characteristic variables, called principal components, that capture the factors representing the trend of process operation, or combinations of variables, from 7 pressure sensor and 5 temperature sensor signals collected from a CNG station every second. Real-time monitoring is performed by comparing process operation data measured in real time against the built model. In monitoring tests conducted to verify the system and improve its accuracy, all data from normal operation were classified as normal. When an abnormality was successfully detected, its cause could be narrowed down by tracking the variables that fell outside the score plot.

The Flower Morphological Characteristics of Salix caprea×Salix gracilistyla

  • Seo, Han-Na;Chae, Seung-Beom;Lim, Hyo-In;Cho, Wonwoo;Lee, Wi-Young
    • Journal of Forest and Environmental Science / v.37 no.1 / pp.35-43 / 2021
  • The interspecific hybrid of Salix caprea and Salix gracilistyla has never been identified or studied in Korea. Accordingly, this study investigated the flower morphological characteristics of the interspecific hybrid between S. caprea and S. gracilistyla and compared the hybrid with each of S. caprea and S. gracilistyla. The female flowers were investigated for 12 characteristics and the male flowers for nine. For the female flowers, those of the hybrids were larger than those of S. caprea and S. gracilistyla in terms of catkin length (CL), bract length (BL), and bract width (BW). The hybrids are intermediate between S. caprea and S. gracilistyla in terms of ovary length, width, and stipitate length as well as gland length (GL). For the male flowers, those of the hybrids were larger than those of S. caprea and S. gracilistyla in terms of CL, BL, and BW. The hybrids are intermediate between S. caprea and S. gracilistyla in terms of catkin width and stamen length (SL). A principal component analysis (PCA) of the female data showed that the first principal component (PC) explained 57.5% of the total variation. The first PC was highly correlated with the ovary stipitate and pistil style lengths and separated the samples into three groups: S. caprea, S. gracilistyla, and the hybrid. A PCA of the male data showed that the first PC explained 35.7% of the total variation. The first PC was highly correlated with the adelphous SL and likewise separated the samples into three groups: S. caprea, S. gracilistyla, and the hybrid. The results of the discriminant analysis showed that S. caprea, S. gracilistyla, and the hybrid were distinguishable by flower morphological characteristics. Therefore, the hybrid was distinctly separated from S. caprea and S. gracilistyla by flower characteristics.

DWT-PCA Combination for Noise Detection in Wireless Sensor Networks (무선 센서 네트워크에서 노이즈 감지를 위한 DWT-PCA 조합)

  • Dang, Thien-Binh;Le, Duc-Tai;Kim, Moonseong;Choo, Hyunseung
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.144-146 / 2020
  • The Discrete Wavelet Transform (DWT) is an effective technique that is commonly used for detecting noise in the data collected by an individual sensor. In addition, the detection accuracy can be significantly improved by exploiting the correlation in the data of neighboring sensors in Wireless Sensor Networks (WSNs). Principal component analysis is a powerful technique for analyzing the correlation in multivariate data. In this paper, we propose a DWT-PCA combination scheme for noise detection (DWT-PCA-ND). Experimental results on a real dataset show that DWT-PCA-ND performs remarkably better than the conventional PCA scheme in detecting noise, a common anomaly in data collected by WSNs.