• Title/Abstract/Keywords: Multiple Principal Component Analysis

159 search results (processing time: 0.024 s)

Projection spectral analysis: A unified approach to PCA and ICA with incremental learning

  • Kang, Hoon;Lee, Hyun Su
    • ETRI Journal
    • /
    • Vol. 40, No. 5
    • /
    • pp.634-642
    • /
    • 2018
  • Projection spectral analysis is investigated and refined in this paper, in order to unify principal component analysis and independent component analysis. Singular value decomposition and spectral theorems are applied to nonsymmetric correlation or covariance matrices with multiplicities or singularities, where projections and nilpotents are obtained. Therefore, the suggested approach not only utilizes a sum-product of orthogonal projection operators and real distinct eigenvalues for squared singular values, but also reduces the dimension of correlation or covariance if there are multiple zero eigenvalues. Moreover, incremental learning strategies of projection spectral analysis are also suggested to improve the performance.
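
The "sum-product of orthogonal projection operators and eigenvalues" in the abstract above can be illustrated for the simplest case. This is only a sketch under stated assumptions: it uses a symmetric sample covariance matrix (the paper also treats nonsymmetric matrices and nilpotents, which are not covered here), synthetic data, and hypothetical variable names. It rebuilds the covariance from rank-one projections, checks that the eigenvalues match the scaled squared singular values, and counts the non-zero eigenvalues to find the reduced dimension.

```python
import numpy as np

# Synthetic data (assumption): 4 variables, the last one redundant,
# so the covariance matrix has exactly one zero eigenvalue.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 3] = X[:, 0] + X[:, 1]

Xc = X - X.mean(axis=0)                  # center the data
C = Xc.T @ Xc / (len(Xc) - 1)            # symmetric sample covariance

# Spectral decomposition: C = sum_i lambda_i * P_i, with P_i = v_i v_i^T
eigvals, eigvecs = np.linalg.eigh(C)     # eigenvalues in ascending order
projections = [np.outer(v, v) for v in eigvecs.T]
C_rebuilt = sum(lam * P for lam, P in zip(eigvals, projections))

# Eigenvalues of C equal squared singular values of Xc divided by (n - 1)
s = np.linalg.svd(Xc, compute_uv=False)
sv_eigs = np.sort(s**2 / (len(Xc) - 1))

# Dimension reduction: keep only directions with non-zero eigenvalues
tol = 1e-10 * eigvals.max()
rank = int(np.sum(eigvals > tol))
print("effective dimension:", rank)
```

Each `P_i` here is an orthogonal projection onto one eigenvector, so dropping the terms with zero eigenvalues loses nothing from the covariance.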

Probabilistic condition assessment of structures by multiple FE model identification considering measured data uncertainty

  • Kim, Hyun-Joong;Koh, Hyun-Moo
    • Smart Structures and Systems
    • /
    • Vol. 15, No. 3
    • /
    • pp.751-767
    • /
    • 2015
  • A new procedure is proposed for assessing the probabilistic condition of structures while accounting for the effect of measured data uncertainty. In this procedure, multiple Finite Element (FE) models are identified by using weighting vectors that represent the uncertainty conditions of the measured data. The distribution of structural parameters is analysed using Principal Component Analysis (PCA) in relation to the uncertainty conditions, and the identified models are classified into groups according to their similarity by using a K-means method. The condition of a structure is then assessed probabilistically using the FE models in the classified groups, each of which represents a specific uncertainty condition of the measured data. Yeondae bridge, a steel-box girder expressway bridge in Korea, is used as an illustrative example. The probabilistic condition of the bridge is evaluated from the distribution of load rating factors obtained using the multiple FE models. The numerical example shows that the proposed method can quantify the uncertainty of measured data and subsequently evaluate the probabilistic condition of bridges efficiently.
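
The PCA-then-K-means grouping described in the abstract above can be sketched as follows. Everything concrete here is an assumption made for illustration: the "identified parameter vectors" are synthetic, the helper names `pca_scores` and `kmeans` are hypothetical, and the K-means loop is a plain Lloyd iteration rather than the authors' implementation.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project centered data onto its leading principal axes (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd iteration; initial centers are k distinct data points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in (assumption): 60 identified models, 5 structural
# parameters each, drawn from two well-separated uncertainty conditions.
rng = np.random.default_rng(1)
group_a = rng.normal(loc=0.0, scale=0.3, size=(30, 5))
group_b = rng.normal(loc=5.0, scale=0.3, size=(30, 5))
params = np.vstack([group_a, group_b])

scores = pca_scores(params, n_components=2)
labels, centers = kmeans(scores, k=2)
print(labels)
```

Clustering in the reduced PCA space rather than on the raw parameters is a design choice of this sketch; the abstract only states that both techniques are used.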

Multiple Group Testing Procedures for Analysis of High-Dimensional Genomic Data

  • Ko, Hyoseok;Kim, Kipoong;Sun, Hokeun
    • Genomics & Informatics
    • /
    • Vol. 14, No. 4
    • /
    • pp.187-195
    • /
    • 2016
  • In genetic association studies with high-dimensional genomic data, multiple group testing procedures are often required in order to identify disease/trait-related genes or genetic regions, where multiple genetic sites or variants are located within the same gene or genetic region. However, statistical testing procedures based on an individual test suffer from multiple testing issues such as the control of the family-wise error rate and dependent tests. Moreover, detecting only a few genes associated with a phenotype outcome among tens of thousands of genes is of main interest in genetic association studies. For this reason, regularization procedures, where a phenotype outcome is regressed on all genomic markers and the regression coefficients are then estimated based on a penalized likelihood, have been considered a good alternative approach to the analysis of high-dimensional genomic data. However, the selection performance of regularization procedures has rarely been compared with that of statistical group testing procedures. In this article, we performed extensive simulation studies in which commonly used group testing procedures such as principal component analysis, Hotelling's $T^2$ test, and the permutation test are compared with group lasso (least absolute shrinkage and selection operator) in terms of true positive selection. We also applied all methods considered in the simulation studies to identify genes associated with ovarian cancer from over 20,000 genetic sites generated from the Illumina Infinium HumanMethylation27K Beadchip. We found a large discrepancy in the selected genes between the multiple group testing procedures and group lasso.

승용차 도심 주행패턴에 의한 연비 성능 분석 (A Study on the Fuel Economy based on the Driving Patterns for Passenger Car in the Metropolitan Area)

  • 정남훈;이우택;선우명호
    • 한국자동차공학회논문집
    • /
    • Vol. 11, No. 1
    • /
    • pp.25-31
    • /
    • 2003
  • Many factors influence automobile fuel economy, such as average speed, average acceleration, and acceleration sum per kilometer. In this study, various driving data were recorded during road tests. The accumulated road test mileage in the Seoul metropolitan area is around 1,300 kilometers. The data were analyzed by multivariate statistical techniques including correlation analysis, principal component analysis, and multiple linear regression analysis. The results show that the average trip time per kilometer is one of the most important factors for fuel consumption, and that increasing the average speed is desirable for reducing emissions and fuel consumption.

Principal Component and Multiple Regression Analysis for Steel Fiber Reinforced Concrete (SFRC) Beams

  • Islam, Mohammad S.;Alam, Shahria
    • International Journal of Concrete Structures and Materials
    • /
    • Vol. 7, No. 4
    • /
    • pp.303-317
    • /
    • 2013
  • This study evaluates the shear strength of steel fiber reinforced concrete (SFRC) beams from a database, which consists of extensive experimental results of 222 SFRC beams having no stirrups. In order to predict the analytical shear strength of the SFRC beams more precisely, the selected beams were sorted into six different groups based on their ultimate concrete strength (low strength with $f_c^{\prime} < 50$ MPa and high strength with $f_c^{\prime} \geq 50$ MPa), span-depth ratio (shallow beam with $a/d \geq 2.5$ and deep beam with $a/d < 2.5$) and steel fiber shape (plain, crimped and hooked). Principal component and multiple regression analyses were performed to determine the most feasible model for predicting the shear strength of SFRC beams. A variety of statistical analyses were conducted, and the results were compared with those of the existing equations for estimating the shear strength of SFRC beams. The results showed that the recommended empirical equations assess the shear strength of SFRC beams more accurately than the previously developed models.

유사색 모집단을 이용한 개선된 분광 반사율 추정 (Advanced surface spectral-reflectance estimation using a population with similar colors)

  • 이철희;김태호;류명춘;오주환
    • 한국산업정보학회:학술대회논문집
    • /
    • 한국산업정보학회 2001년도 춘계학술대회논문집:21세기 신지식정보의 창출
    • /
    • pp.280-287
    • /
    • 2001
  • Studies that estimate the surface spectral reflectance of an object using a multi-spectral camera system have received widespread attention. However, a multi-spectral camera system requires additional color filters as the number of channels increases, and the multiple captures increase system complexity. Thus, this paper proposes an algorithm to reduce the estimation error of surface spectral reflectance with a conventional 3-band RGB camera. In the proposed method, adaptive principal components are calculated for each pixel by renewing the population of surface reflectances, and these adaptive principal components reduce the estimation error of the surface spectral reflectance of the current pixel. To evaluate the performance of the proposed estimation method, 3-band principal component analysis, the 5-band Wiener estimation method, and the proposed method are compared in an estimation experiment with the Macbeth ColorChecker. As a result, the proposed method showed a lower mean square error between the estimated and the measured spectra than the conventional 3-band principal component analysis method, and an estimation performance similar to or better than that of the 5-band Wiener method.

Virtual Coverage: A New Approach to Coverage-Based Software Reliability Engineering

  • Park, Joong-Yang;Lee, Gyemin
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 20, No. 6
    • /
    • pp.467-474
    • /
    • 2013
  • It is common to measure multiple coverage metrics during software testing. Software reliability growth models and coverage growth functions have been applied to each coverage metric to evaluate software reliability; however, analysis results for the individual coverage metrics may conflict with each other. This paper proposes the virtual coverage metric of a normalized first principal component in order to avoid conflicting cases. The use of the virtual coverage metric causes a negligible loss of information.
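
A combined metric built from the first principal component, as the abstract above proposes, might look like the sketch below. The details are assumptions, not the paper's definition: the three coverage curves are synthetic exponential growth curves with hypothetical rates, and "normalized" is interpreted here as rescaling the first-component loadings so that they sum to one, which turns the virtual metric into a weighted average of the individual metrics.

```python
import numpy as np

# Synthetic coverage histories (assumption): three metrics, e.g. statement,
# branch and path coverage, each growing as 1 - exp(-rate * t).
t = np.linspace(0, 10, 50)
rates = np.array([0.3, 0.5, 0.8])             # hypothetical growth rates
coverage = 1 - np.exp(-np.outer(t, rates))    # shape (50, 3), values in [0, 1)

# First principal component of the covariance between the metrics
C = np.cov(coverage, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)          # ascending eigenvalues
v1 = eigvecs[:, -1]                           # loadings of the first component

# Normalization assumed here: loadings rescaled to sum to one
# (dividing by the sum also removes the arbitrary sign of v1)
weights = v1 / v1.sum()
virtual = coverage @ weights                  # single combined coverage curve

# Share of total variance carried by the first component
explained = eigvals[-1] / eigvals.sum()
print("weights:", weights, "explained share:", explained)
```

When the individual metrics are strongly correlated, the first component carries most of the variance, which is the sense in which replacing several metrics by one loses little information.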

Assessment of water quality variations under non-rainy and rainy conditions by principal component analysis techniques in Lake Doam watershed, Korea

  • Bhattrai, Bal Dev;Kwak, Sungjin;Heo, Woomyung
    • Journal of Ecology and Environment
    • /
    • Vol. 38, No. 2
    • /
    • pp.145-156
    • /
    • 2015
  • This study was based on water quality data of the Lake Doam watershed, monitored from 2010 to 2013 at eight different sites with multiple physicochemical parameters. The dataset was divided into two sub-datasets, namely, non-rainy and rainy. Principal component analysis (PCA) and factor analysis (FA) techniques were applied to evaluate seasonal correlations of water quality parameters and extract the most significant parameters influencing stream water quality. The first five principal components identified by PCA explained more than 80% of the total variance for both datasets. PCA and FA results indicated that total nitrogen, nitrate nitrogen, total phosphorus, and dissolved inorganic phosphorus were the most significant parameters under the non-rainy condition. This indicates that organic and inorganic pollutant loads in the streams can be related to discharges from point sources (domestic discharges) and non-point sources (agriculture, forest) of pollution. During the rainy period, turbidity, suspended solids, nitrate nitrogen, and dissolved inorganic phosphorus were identified as the most significant parameters. The physical parameters, suspended solids and turbidity, are related to soil erosion and runoff from the basin. Organic and inorganic pollutants during the rainy period can be linked to decayed matter, manure, and inorganic fertilizers used in farming. Thus, the results of this study suggest that principal component analysis techniques are useful for the analysis and interpretation of data and the identification of pollution factors, which are valuable for understanding seasonal variations in water quality for effective management.
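
The statement that the leading components explain more than 80% of the total variance corresponds to a cumulative explained-variance calculation. The sketch below shows one way to do it, assuming PCA on the correlation matrix (a common choice when parameters have different units, but not confirmed by the abstract). The data are synthetic and the function name `components_for_variance` is hypothetical.

```python
import numpy as np

def components_for_variance(X, threshold=0.80):
    """Smallest number of principal components whose cumulative share of
    total variance reaches `threshold`, using the correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    eigvals = np.linalg.eigvalsh(R)[::-1]          # descending order
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cumulative, threshold) + 1)
    return k, cumulative

# Synthetic stand-in (assumption): 300 samples of 10 water quality
# parameters driven by 2 latent factors plus measurement noise.
rng = np.random.default_rng(3)
factors = rng.normal(size=(300, 2))
loadings = rng.normal(size=(2, 10))
X = factors @ loadings + 0.1 * rng.normal(size=(300, 10))

k, cumulative = components_for_variance(X, threshold=0.80)
print("components needed:", k)
print("cumulative share:", np.round(cumulative, 3))
```

Running the same function separately on the non-rainy and rainy subsets would give the per-dataset component counts that the abstract compares.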

두 층 관측 기상인자의 주성분-다중회귀분석으로 도출되는 고농도 미세먼지의 부산-서울 지역차이 해석 (Interpretation and Comparison of High PM2.5 Characteristics in Seoul and Busan based on the PCA/MLR Statistics from Two Level Meteorological Observations)

  • 최다니엘;장임석;김철희
    • 대기
    • /
    • Vol. 31, No. 1
    • /
    • pp.29-43
    • /
    • 2021
  • In this study, a two-step statistical approach consisting of Principal Component Analysis (PCA) and Multiple Linear Regression (MLR) was employed, and the main meteorological factors explaining the high-PM2.5 episodes were identified in two regions: Seoul and Busan. We first performed PCA to isolate the Principal Components (PCs), which are linear combinations of the meteorological variables observed at two levels: the surface and the 850 hPa level. The variables employed at the surface are temperature (T2m), wind speed, sea level pressure, and the south-north and west-east wind components; those at the 850 hPa upper level are the south-north (v850) and west-east (u850) wind components and vertical stability. Secondly, we carried out MLR analysis and verified the relationships between the PM2.5 daily mean concentration and the meteorological PCs. Our two-step statistical approach revealed that in Seoul, the dominant factors influencing the high-PM2.5 days are mainly composed of upper wind characteristics in winter, including positive u850 and negative v850, indicating that the continental (or Siberian) anticyclone had a strong influence. In Busan, however, the dominant factors explaining high PM2.5 concentrations were associated with high T2m and negative u850 in summer. This suggests that the marine anticyclone had a considerable effect on Busan's high PM2.5, together with high temperature, which is relevant to vigorous photochemical secondary generation. The differences and similarities between the two regions, derived from statistical approaches alone, imply that the high-PM2.5 episodes in Korea show their own unique characteristics and seasonality, which are mostly explainable by two-layer (surface and upper) mesoscale meteorological variables.
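
The two-step PCA/MLR procedure described above can be sketched as a principal component regression. This is an illustration under assumptions, not the authors' code: the eight "meteorological variables" and the PM2.5-like response are synthetic, the function name `pca_mlr` is hypothetical, and standardizing the variables before PCA is a choice made here because the variables would have different units.

```python
import numpy as np

def pca_mlr(X, y, n_components):
    """Step 1: PCA on standardized predictors.
    Step 2: least-squares regression of y on the leading PC scores."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T             # (n_variables, n_components)
    scores = Z @ loadings                      # PC time series
    A = np.column_stack([np.ones(len(y)), scores])   # intercept + PCs
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    r2 = 1 - np.sum(residual**2) / np.sum((y - y.mean())**2)
    return loadings, coef, r2

# Synthetic stand-in (assumption): 200 days, 8 variables observed at the
# surface and at 850 hPa, and a response that depends on two of them.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 8))
y = 2.0 * X[:, 0] - 1.0 * X[:, 5] + rng.normal(scale=0.5, size=200)

loadings, coef, r2 = pca_mlr(X, y, n_components=3)
print("regression coefficients (intercept first):", coef)
print("R^2:", r2)
```

The columns of `loadings` show which original variables dominate each PC, and the size of each regression coefficient indicates how strongly that PC is related to the response; this is how dominant factors such as u850 or T2m would be read off in an analysis of this kind.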

다특성 파라미터설계 방법의 비교 연구 (A Comparison of Parameter Design Methods for Multiple Performance Characteristics)

  • 소우진;염봉진
    • 대한산업공학회지
    • /
    • Vol. 38, No. 3
    • /
    • pp.198-207
    • /
    • 2012
  • In product or process parameter design, the case of multiple performance characteristics appears more commonly than that of a single characteristic. Numerous methods have been developed to deal with such multi-characteristic parameter design (MCPD) problems. Among these, this paper considers three representative methods, which are respectively based on the desirability function (DF), grey relational analysis (GRA), and principal component analysis (PCA). These three methods are then used to solve the MCPD problems in ten case studies reported in the literature. The performance of each method is evaluated for various combinations of its algorithmic parameters and alternatives. The relative performances of the three methods are then compared in terms of the significance of a design parameter and the overall performance value corresponding to the compromise optimal design condition identified by each method. Although no method is significantly inferior to the others for the data sets considered, the GRA-based and PCA-based methods perform slightly better than the DF-based method. In addition, for the PCA-based method, the compromise optimal design condition depends strongly on which alternative is adopted, whereas for the GRA-based method it is almost independent of the algorithmic parameter, which alleviates the difficulty of selecting an appropriate algorithmic parameter value.